Chapter 1. Review of Thermodynamics

This chapter starts from a brief discussion of the subject of statistical physics and thermodynamics, and 
the relation between these two disciplines. Then I proceed to a review of the basic notions and relations 
of thermodynamics. Most of this material is supposed to be known to the reader from his or her 
undergraduate studies,[1] so the discussion is rather brief.


1.1. Introduction: Statistical physics and thermodynamics 

Statistical physics (alternatively called “statistical mechanics”) and thermodynamics are two 
different approaches to the same goal: a description of internal dynamics of large physical systems, 
notably those consisting of many, N » 1, identical particles - or other components. The traditional
example of such a system is a human-scale portion of a gas, with the number N of molecules of the order
of the Avogadro number N_A ~ 10^23.[2] The “internal dynamics” is an (admittedly loose) term meaning all
the physics unrelated to the motion of the system as a whole. The most important example of the internal 
dynamics is the thermal motion of atoms and molecules. 

The motivation for the statistical approach to such systems is straightforward: even if the laws 
governing the dynamics of each particle and their interactions were exactly known, and we had infinite 
computing resources at our disposal, calculating the exact evolution of the system in time would be 
impossible, at least because it is completely impracticable to measure the exact initial state of each
component, e.g., the initial position and velocity of each particle. The situation is further exacerbated by
the phenomena of chaos and turbulence,[3] and the quantum-mechanical uncertainty,[4] which do not allow
the exact calculation of final positions and velocities of the component particles even if their initial state 
is known with the best possible precision. As a result, in most situations only statistical predictions 
about behavior of such systems may be made, with the probability theory becoming a major part of the 
mathematical tool arsenal. 

However, the statistical approach is not as bad as it may look. Indeed, it is almost self-evident 
that any measurable macroscopic variable characterizing a stationary system of N » 1 particles as a 
whole (think, e.g., about pressure P of a gas contained in a fixed volume V) is almost constant in time. 
Indeed, we will see below that, besides certain exotic exceptions, the relative fluctuations - either in
time, or among macroscopically similar systems - of such a variable are of the order of 1/√N, i.e. for
N ~ N_A are extremely small. As a result, the average values of macroscopic variables may characterize the
state of the system rather well. Their calculation is the main task of statistical physics. (The
analysis of fluctuations is also an important task, but because the fluctuations are small, it may in
most cases be based on perturbative approaches - see Chapter 5.)
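The 1/√N scaling of relative fluctuations can be illustrated with a minimal Python sketch. It assumes a toy model that is not part of these notes: the macroscopic variable is the sum of N independent contributions, each drawn uniformly from [0, 1). The function name and its parameters are hypothetical.

```python
import random
import statistics


def relative_fluctuation(n_particles, n_samples=300, seed=0):
    """Relative r.m.s. fluctuation of a sum of n_particles independent
    contributions, each uniform on [0, 1) (a toy assumption)."""
    rng = random.Random(seed)
    totals = [
        sum(rng.random() for _ in range(n_particles))
        for _ in range(n_samples)
    ]
    # Ratio of the spread of the total to its mean value.
    return statistics.pstdev(totals) / statistics.fmean(totals)


if __name__ == "__main__":
    for n in (100, 400, 1600):
        r = relative_fluctuation(n)
        # r * sqrt(n) should stay roughly constant if r scales as 1/sqrt(n).
        print(n, r, r * n ** 0.5)
```

For this toy distribution, the product of the relative fluctuation and √N should stay close to 1/√3 ≈ 0.58, up to the sampling error of a few percent.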


[1] For remedial reading, I can recommend, for example (in alphabetical order): C. Kittel and H. Kroemer,
Thermal Physics, 2nd ed., W. H. Freeman (1980); F. Reif, Fundamentals of Statistical and Thermal Physics,
Waveland (2008); D. V. Schroeder, Introduction to Thermal Physics, Addison Wesley (1999).

[2] See, e.g., Sec. 4 below. (Note that in these notes, the chapter number is dropped in references to figures,
formulas, and sections within the same chapter.)

[3] See, e.g., CM Chapters 8 and 9. (Acronyms CM, EM, and QM refer to my other lecture note series.)

[4] See, e.g., QM Chapter 1.

Let us have a look at typical macroscopic variables that statistical physics and thermodynamics
should operate with. Since I have already mentioned pressure P and volume V, let me start with this
famous pair. First of all, note that volume is an extensive variable, i.e. a variable whose value for a
system consisting of several non-interacting (or weakly interacting) parts is the sum of those of its parts.
On the other hand, pressure is an example of intensive variables, whose value is the same for different
parts of a system - if they are in equilibrium. In order to understand why P and V form a natural pair of
variables, let us consider the classical playground of thermodynamics, a portion of a gas contained in a
cylinder, closed with a movable piston of area A (Fig. 1). Neglecting friction between the walls and the
piston, and assuming that it is being moved slowly enough (so that the pressure P, at any instant, is
virtually the same for all parts of the volume), the elementary work of the external force f = -PA,
compressing the gas, at a small piston displacement dx, is

$$dW = f\,dx = -PA\,dx = -P\,dV. \tag{1.1}$$

It is clear that the last expression is more general than the model shown in Fig. 1, and does not depend
on the particular shape of the system surface.[5]

Fig. 1.1. Compressing a gas (figure labels: force f, piston coordinate x).
From the point of view of analytical mechanics,[6] V and (-P) is just one of many possible canonical pairs
of generalized coordinates q_j and generalized forces f_j, whose products dW_j = f_j dq_j contribute to
the total work of the environment on the system under analysis. For example, the reader familiar with
the basics of electromagnetism knows that the elementary work of an electric field on a unit volume of a
medium is[7]

$$dW = \boldsymbol{\mathcal{E}}\cdot d\boldsymbol{\mathcal{D}} = \sum_{j=1}^{3} \mathcal{E}_j\, d\mathcal{D}_j, \tag{1.2}$$

so that the role of generalized coordinates is played by the Cartesian components of the electric
displacement $\boldsymbol{\mathcal{D}}$, while the components of the electric field $\boldsymbol{\mathcal{E}}$ serve as the corresponding generalized
forces. Similarly, the elementary work of the magnetic field $\boldsymbol{\mathcal{H}}$ is[8]

$$dW = \boldsymbol{\mathcal{H}}\cdot d\boldsymbol{\mathcal{B}} = \sum_{j=1}^{3} \mathcal{H}_j\, d\mathcal{B}_j, \tag{1.3}$$

where $\boldsymbol{\mathcal{B}}$ is the magnetic induction. This list may be extended to other interactions (such as gravitation,
surface tension in fluids, etc.). Following tradition, I will use the {-P, V} pair in almost all the formulas
below, as well as in most examples, but the reader should remember that they all are valid for any other
pair {f_j, q_j}.

[5] In order to prove that, it is sufficient to integrate the scalar product dW = df·dr, with df = -Pn d^2r, where dr is
the surface displacement vector (see, e.g., CM Sec. 7.1), and n is the outer normal, over the surface.

[6] See, e.g., CM Chapters 2 and 10.

[7] See, e.g., EM Eq. (3.82).

[8] See, e.g., EM Eq. (5.128). Note that Eqs. (2)-(3) are in SI units (used throughout this lecture series). In the
Gaussian units, the right-hand parts of these relations have additional coefficients 1/4π.

Again, the specific relations between the variables of each pair listed above are typically affected 
by the statistics of the components (particles) of a body, but their definition is not based on statistics. 
The situation is very different for a very specific pair of variables, temperature T and entropy S, 
although these “sister variables” participate in many formulas of thermodynamics exactly like one more 
canonical pair {f_j, q_j}. However, the very existence of these two variables is due to statistics.
Temperature T is an intensive variable that characterizes the degree of thermal “agitation” of system
components. In contrast, entropy S is an extensive variable that in most cases evades immediate
perception by human senses; it is a quantitative measure of disorder of the system, i.e. the degree of our
ignorance about its exact microscopic state.[9]

The reason for the appearance of the {T, S} pair of variables in formulas is that the statistical 
approach to large systems of particles brings some qualitatively new results, most notably the notion of 
irreversible time evolution of collective (macroscopic) variables describing the system. On one hand,
such irreversibility looks absolutely natural in such phenomena as the diffusion of an ink drop in a glass
of water. In the beginning, the ink molecules are located in a certain small part of the system’s volume, i.e.
to some extent ordered, while at the late stages of diffusion, the position of each molecule is essentially
random. However, on second thought, the irreversibility is rather surprising,[10] taking into account that
it takes place even if the laws governing the motion of the system’s components are time-reversible - such
as the Newton laws or the basic laws of quantum mechanics. Indeed, if, at a late stage of the diffusion
process, we exactly reversed the velocities of all molecules simultaneously, the ink molecules would
again gather (for a moment) into the original spot.[11] The problem is that getting the information
necessary for the exact velocity reversal is not practicable. This example shows a deep connection
between statistical mechanics and information theory.
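The velocity-reversal argument can be tried on a toy model. The Python sketch below is only an illustration under strong assumptions that are not in these notes: the "molecules" do not interact and stream freely on a ring of integer sites, so this is not real diffusion, but the dynamics is exactly time-reversible. All names and parameter values are hypothetical.

```python
import random


def stream(positions, velocities, steps, ring):
    """Advance non-interacting particles on a ring of `ring` integer sites."""
    return [(x + v * steps) % ring for x, v in zip(positions, velocities)]


def spread(positions):
    """Number of distinct occupied sites: a crude measure of 'disorder'."""
    return len(set(positions))


def reversal_demo(n=1000, ring=10_000, steps=137, seed=1):
    rng = random.Random(seed)
    # Initial "ink drop": all particles within the first 10 sites.
    x0 = [rng.randrange(10) for _ in range(n)]
    # Random integer velocities, so that the arithmetic is exact.
    v = [rng.randrange(-50, 51) for _ in range(n)]
    x_late = stream(x0, v, steps, ring)                    # forward evolution
    x_back = stream(x_late, [-u for u in v], steps, ring)  # reversed velocities
    return x0, x_late, x_back


if __name__ == "__main__":
    x0, x_late, x_back = reversal_demo()
    print(spread(x0), spread(x_late), x_back == x0)
```

The particles spread over many sites during the forward evolution, and return exactly to the initial spot after the reversal. The exact return relies on knowing every velocity precisely, which is the information that is not practicably available for a real system.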

A quantitative discussion of the reversibility-irreversibility dilemma requires a strict definition of
the basic notion of statistical mechanics (and indeed of probability theory), the statistical ensemble, and
I would like to postpone it until the beginning of Chapter 2. In particular, in that chapter we will see that
the basic law of irreversible behavior is the increase of entropy S in any closed system. Thus, statistical
mechanics, without defying the “microscopic” laws governing the evolution of the system’s components,
introduces on top of them some new “macroscopic” laws, intrinsically related to the evolution of
information, i.e. the degree of our knowledge of the microscopic state of the system.

[9] The notion of entropy was introduced into thermodynamics in the 1850s by R. Clausius, on the background of
an earlier pioneering work by S. Carnot (see Sec. 7 below), as a variable related to “useful thermal energy” rather
than a measure of disorder. In the absence of any clue about entropy’s microscopic origins (which had to wait for
decades until the works by L. Boltzmann and J. Maxwell), this was an amazing intellectual achievement.

[10] Indeed, as recently as in the late 19th century, the very possibility of irreversible macroscopic behavior of
microscopically reversible systems was questioned by some serious scientists, notably by J. Loschmidt in 1876.

[11] While quantum-mechanical effects, with their intrinsic uncertainty, are quantitatively important in this example,
our qualitative discussion does not depend on them. A good example is the chaotic, but classical motion of a
billiard ball on a 2D Sinai table - see CM Fig. 9.8.

To conclude this brief discussion of variables, let me mention that as in all fields of physics, a 
very special role in statistical mechanics is played by energy E. In order to emphasize the commitment 
to disregard the motion of the system as a whole, in thermodynamics it is frequently called the internal 
energy, though for brevity, I will mostly skip the adjective. Its simplest example is the kinetic energy of 
the thermal motion of molecules in a dilute gas, but in general E includes not only the individual
energies of all the system’s components, but also the energies of their interactions with each other. Besides a few
pathological cases of very-long-range interactions (such as the Coulomb interactions in plasma with
uncompensated charge density), the interactions may be treated as local; in this case the internal energy
is proportional to N, i.e. is an extensive variable. As will be shown below, other extensive variables with
the dimension of energy are often very useful, including the (Helmholtz) free energy F, the Gibbs
energy G, enthalpy H, and grand potential Ω. (The collective name for such variables is thermodynamic
potentials.)

Now, we are ready for a brief discussion of the relation between statistical physics and 
thermodynamics. While the task of statistical physics is to calculate the macroscopic variables discussed 
above,[12] using this or that particular microscopic model of the system, the main role of thermodynamics
is to derive some general relations between the average values of the macroscopic variables (called
thermodynamic variables) that do not depend on specific models. Surprisingly, it is possible to
accomplish such a feat using a few either evident or very plausible general assumptions (sometimes
called the laws of thermodynamics), which find their proof in statistical physics.[13] Such general relations
allow us to reduce rather substantially the amount of calculations we have to do in statistical physics; in
many cases it is sufficient to calculate from statistics just one or two variables, and then use
thermodynamic relations to calculate all other properties of interest. Thus thermodynamics,
sometimes snubbed as a phenomenology, deserves every respect not only as a discipline which is, in a
certain sense, more general than statistical physics as such, but also as a very useful science. This is why
the balance of this chapter is devoted to a brief review of thermodynamics. 


1.2. The 2nd law of thermodynamics, entropy, and temperature


Thermodynamics accepts a phenomenological approach to entropy S, postulating that there is
such a unique extensive measure of disorder, and that in a closed system,[14] it may only grow in time,
reaching its constant (maximum) value at equilibrium:[15]

$$dS \ge 0. \tag{1.4}$$

This postulate is called the 2nd law of thermodynamics - arguably its only substantial new law.


[12] Several other quantities, for example the heat capacity C, may be obtained as partial derivatives of the basic
variables discussed above. Also, under certain conditions, the number of particles N in the system is not fixed and
may also be considered as an (extensive) variable.

[13] Admittedly, some of these proofs are based on other (but deeper) postulates, for example the central statistical
hypothesis - see Sec. 2.2.

[14] Defined as a system completely isolated from the environment, i.e. a system with its internal energy fixed.

[15] Implicitly, this statement also postulates the existence, in a closed system, of thermodynamic equilibrium, an
asymptotically reached state in which all thermodynamic variables, including entropy, remain constant.

Surprisingly, this law, together with the additivity of S in composite systems of non-interacting
parts (as an extensive variable), is sufficient for a formal definition of temperature, and a derivation of
its basic properties that comply with our everyday notion of this variable. Indeed, let us consider a
particular case: a closed system consisting of two fixed-volume subsystems (Fig. 2) whose internal
relaxation is very fast in comparison with the rate of the thermal flow (i.e. the energy and entropy
exchange) between the parts. In this case, on the latter time scale, each part is always in some
quasi-equilibrium state, which may be described by a unique relation E(S) between its energy and entropy.[16]



Fig. 1.2. Composite thermodynamic system (figure labels: E_1, S_1 and E_2, S_2 for the two parts; dE, dS for the exchange between them).


Neglecting the interaction energy between the parts (which is always possible at N » 1, in the
absence of long-range interactions), we may use the extensive character of variables E and S to write

$$E = E_1(S_1) + E_2(S_2), \qquad S = S_1 + S_2, \tag{1.5}$$

for the full energy and entropy of the system. Now let us calculate the following derivative:

$$\frac{dS}{dE_1} = \frac{dS_1}{dE_1} + \frac{dS_2}{dE_1} = \frac{dS_1}{dE_1} + \frac{dS_2}{dE_2}\frac{dE_2}{dE_1} = \frac{dS_1}{dE_1} + \frac{dS_2}{dE_2}\frac{d(E - E_1)}{dE_1}. \tag{1.6}$$

Since the total energy E of the system is fixed and hence independent of its re-distribution
between the sub-systems, dE/dE_1 = 0, and we get

$$\frac{dS}{dE_1} = \frac{dS_1}{dE_1} - \frac{dS_2}{dE_2}. \tag{1.7}$$

According to the 2nd law of thermodynamics, when the two parts reach the thermodynamic equilibrium,
the total entropy S reaches its maximum, so that dS/dE_1 = 0, and Eq. (7) yields

$$\frac{dS_1}{dE_1} = \frac{dS_2}{dE_2}. \tag{1.8}$$
[16] Here we strongly depend on a very important (and possibly the least intuitive) aspect of the 2nd law, namely that
the entropy is the unique measure of disorder, i.e. its only measure which may affect the system’s energy, or any
other thermodynamic variable.

Thus we see that if a thermodynamic system may be partitioned into weakly interacting
macroscopic parts, their derivatives dS/dE should be equal in equilibrium. The reciprocal of such a
derivative is called temperature. Taking into account that our analysis pertains to the situation (Fig. 2)
when both volumes V_{1,2} are fixed, we may write this definition as

$$\left(\frac{\partial E}{\partial S}\right)_V \equiv T, \tag{1.9}$$

the subscript V meaning that volume is kept constant at differentiation. (Such notation is common and very
useful in thermodynamics, with its broad range of variables.)
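The equilibrium condition (8) and the definition (9) can be checked numerically on a toy model. The Python sketch below assumes, purely for illustration, that each part has the entropy S_i = C_i ln E_i with a constant C_i; this particular form is not taken from these notes, and all names are hypothetical. It maximizes the total entropy over the energy split and compares the two temperatures.

```python
import math


def total_entropy(e1, e_total, c1, c2):
    """S = S_1(E_1) + S_2(E - E_1) for the assumed S_i = C_i ln E_i."""
    return c1 * math.log(e1) + c2 * math.log(e_total - e1)


def maximize_entropy(e_total, c1, c2, iterations=200):
    """Ternary search for the E_1 that maximizes S (S is concave in E_1)."""
    lo, hi = 1e-9 * e_total, (1 - 1e-9) * e_total
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if total_entropy(m1, e_total, c1, c2) < total_entropy(m2, e_total, c1, c2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def temperatures(e1, e_total, c1, c2):
    """T_i = 1 / (dS_i/dE_i) = E_i / C_i for the assumed entropy form."""
    return e1 / c1, (e_total - e1) / c2


if __name__ == "__main__":
    e1 = maximize_entropy(10.0, 1.5, 3.0)
    print(e1, temperatures(e1, 10.0, 1.5, 3.0))
```

In this model the entropy maximum is reached at E_1 = E C_1/(C_1 + C_2), where both temperatures equal E/(C_1 + C_2), in agreement with Eq. (8).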

Note that according to Eq. (9), if temperature is measured in energy units (as I will do in this
course for the brevity of notation), S is dimensionless.[17] The transfer to the SI or Gaussian units, i.e. to
temperature T_K measured in kelvins (not “Kelvins”, not “degrees Kelvin”, please!), is given by the relation
T = k_B T_K, where the Boltzmann constant k_B ≈ 1.38×10^-23 J/K = 1.38×10^-16 erg/K.[18] In these units, the
entropy becomes dimensional: S_K = k_B S.
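The conversion between the two conventions is a one-line multiplication. A minimal Python sketch follows, using the exact values of the Boltzmann constant and of the electron-volt fixed by the 2019 redefinition of the SI. The function names are hypothetical, and the conversion to electron-volts is added here only for illustration.

```python
K_B_J_PER_K = 1.380649e-23   # Boltzmann constant in J/K (exact in the 2019 SI)
EV_IN_J = 1.602176634e-19    # one electron-volt in joules (exact in the 2019 SI)


def kelvin_to_joule(t_kelvin):
    """Temperature in energy units: T = k_B * T_K."""
    return K_B_J_PER_K * t_kelvin


def kelvin_to_ev(t_kelvin):
    """The same temperature expressed in electron-volts."""
    return kelvin_to_joule(t_kelvin) / EV_IN_J


def entropy_to_si(s_dimensionless):
    """Dimensional entropy in J/K: S_K = k_B * S."""
    return K_B_J_PER_K * s_dimensionless


if __name__ == "__main__":
    print(kelvin_to_joule(300.0), kelvin_to_ev(300.0))
```

For example, room temperature T_K = 300 K corresponds to T of about 4.14×10^-21 J, i.e. close to 26 meV.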

The definition of temperature, given by Eq. (9), is of course in sharp contrast with the popular
notion of T as a measure of the average energy per particle. However, as we will repeatedly see below,
in most cases these two notions may be reconciled. In particular, let us list some properties of T, which
are in accordance with our everyday notion of temperature:

(i) according to Eq. (9), temperature is an intensive variable (since both E and S are extensive), 
i.e., in a system of similar particles, independent of the particle number N; 

(ii) temperatures of all parts of a system are equal at equilibrium - see Eq. (8); 

(iii) in a closed system whose parts are not in equilibrium, thermal energy (heat) always flows 
from a warmer part (with higher T) to the colder part. 

In order to prove the last property, let us come back to the closed, composite system shown in
Fig. 2, and consider another derivative:

$$\frac{dS}{dt} = \frac{dS_1}{dt} + \frac{dS_2}{dt} = \frac{dS_1}{dE_1}\frac{dE_1}{dt} + \frac{dS_2}{dE_2}\frac{dE_2}{dt}. \tag{1.10}$$

If the internal state of each part is very close to equilibrium (as was assumed from the very beginning) at
each moment of time, we can use Eq. (9) to replace the derivatives dS_{1,2}/dE_{1,2} with 1/T_{1,2} and get

$$\frac{dS}{dt} = \frac{1}{T_1}\frac{dE_1}{dt} + \frac{1}{T_2}\frac{dE_2}{dt}. \tag{1.11}$$

Since in a closed system E = E_1 + E_2 = const, these time derivatives are related as dE_2/dt = -dE_1/dt, and
Eq. (11) yields

$$\frac{dS}{dt} = \left(\frac{1}{T_1} - \frac{1}{T_2}\right)\frac{dE_1}{dt}. \tag{1.12}$$

But in accordance with the 2nd law of thermodynamics, the derivative cannot be negative: dS/dt ≥ 0.
Hence,

$$\left(\frac{1}{T_1} - \frac{1}{T_2}\right)\frac{dE_1}{dt} \ge 0. \tag{1.13}$$


[17] Here I have to mention a traditional unit of energy, still used in some fields related to thermodynamics: the
calorie; in the most common definition (the so-called thermochemical calorie) it equals exactly 4.184 J.

[18] For a more exact value of this and other constants, see appendix CA: Selected Physical Constants. Note that both
T and T_K define the absolute (also called “thermodynamic”) scale of temperature, in contrast to such artificial
temperature scales as degrees Celsius (“centigrades”), defined as T_C = T_K - 273.15, or degrees Fahrenheit: T_F =
(9/5)T_C + 32.

For example, if T_1 > T_2 (i.e. 1/T_1 < 1/T_2), then dE_1/dt ≤ 0, i.e. the warmer part gives energy to its colder
counterpart.

Note also that such a heat exchange, at fixed volumes V_{1,2} and T_1 ≠ T_2, increases the total
system entropy, without performing any “useful” mechanical work.
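Equations (12) and (13) can be illustrated by a small simulation. The Python sketch below rests on assumptions that are not part of these notes: each part has a constant heat capacity C_i, so that E_i = C_i T_i and S_i = C_i ln T_i + const, and the heat flow is taken to be proportional to the temperature difference, with a hypothetical coefficient kappa. Temperatures are in energy units.

```python
import math


def relax(t1, t2, c1=1.0, c2=2.0, kappa=0.1, dt=0.01, steps=5000):
    """Euler integration of the heat exchange between two parts.

    Returns a list of (T_1, T_2, S) tuples, with S counted up to a constant.
    """
    history = []
    for _ in range(steps):
        entropy = c1 * math.log(t1) + c2 * math.log(t2)
        history.append((t1, t2, entropy))
        flow = kappa * (t1 - t2) * dt   # energy leaving part 1 in this step
        t1 -= flow / c1                 # dE_1 = -flow
        t2 += flow / c2                 # dE_2 = +flow
    return history


if __name__ == "__main__":
    hist = relax(4.0, 2.0)
    print(hist[0])
    print(hist[-1])
```

In such a run the warmer part cools down, the total energy C_1 T_1 + C_2 T_2 stays constant, and the total entropy grows at every step until the two temperatures become virtually equal.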


1.3. The 1st and 3rd laws of thermodynamics, and heat capacity

Now let us consider a thermally insulated system whose volume V may be changed by a
deterministic force - see, for example, Fig. 1. Such a system is different from the fully closed one,
because its energy E may be changed by the external force’s work - see Eq. (1):

$$dE = dW = -P\,dV. \tag{1.14}$$

Let the volume change be so slow (dV/dt → 0) that the system is virtually at equilibrium at any
instant without much error. Such a slow process is called reversible, and in this particular case of a
thermally insulated system, it is also called adiabatic. If pressure P (or any generalized external force
f_j) is deterministic, i.e. is a predetermined function of time independent of the state of the system under
analysis, it may be considered as coming from a fully ordered system, i.e. one having zero entropy,
with the total system completely closed. Since according to the second of Eqs. (5), the entropy of the
total closed system should stay constant, S of the system under analysis should stay constant on its own.
Thus we arrive at a very important conclusion: in an adiabatic process, the entropy of a system cannot
change.[19] This means that we can use Eq. (14) to write

$$P = -\left(\frac{\partial E}{\partial V}\right)_S. \tag{1.15}$$


Let us now consider an even more general thermodynamic system that may also exchange
thermal energy (“heat”) with the environment (Fig. 3).

Fig. 1.3. General thermodynamic process involving both the mechanical work and heat
exchange with the environment.


For such a system, our previous conclusion about the entropy constancy is not valid, so that S, in
equilibrium, may be a function of not only energy E, but also of volume V. Let us resolve this relation
for energy: E = E(S, V), and write the general mathematical expression for the full differential of E as a
function of these two independent arguments:

$$dE = \left(\frac{\partial E}{\partial S}\right)_V dS + \left(\frac{\partial E}{\partial V}\right)_S dV. \tag{1.16}$$


[19] A general (not necessarily adiabatic) process conserving entropy is sometimes called isentropic.

This formula, based on the stationary relation E = E(S, V), is evidently valid not only in
equilibrium, but also for very slow, reversible[20] processes. Now, using Eqs. (9) and (15), we may rewrite
Eq. (16) as

$$dE = T\,dS - P\,dV. \tag{1.17}$$

The second term in the right-hand part of this equation is just the work of the external force, so that due
to the conservation of energy,[21] the first term has to be equal to the heat dQ transferred from the
environment to the system (see Fig. 3):

$$dE = dQ + dW, \tag{1.18}$$

$$dQ = T\,dS. \tag{1.19}$$

The last relation, divided by T and then integrated along an arbitrary (but reversible!) process,

$$S = \int \frac{dQ}{T} + \text{const}, \tag{1.20}$$

is sometimes used as an alternative definition of entropy S - provided that temperature is defined not by
Eq. (9), but in some independent way. It is useful to recognize that entropy (like energy) may be defined
up to an arbitrary constant, which does not affect any other thermodynamic observables. The common
convention is to take

$$S \to 0 \quad \text{at} \quad T \to 0. \tag{1.21}$$

This condition is sometimes called the 3rd law of thermodynamics, but it is important to realize that this
is just a convention rather than a real law.[22] Indeed, the convention corresponds well to the notion of the
full order at T = 0 in some systems (e.g., perfect crystals), but creates ambiguity for other systems, e.g.,
amorphous solids (like the usual glasses) that may remain, for “astronomic” times, highly disordered
even at T → 0.

Now let us discuss the notion of heat capacity that, by definition, is the ratio dQ/dT, where dQ is
the amount of heat that should be given to a system to raise its temperature by a small amount dT.[23]
(This notion is very important, because it may be most readily measured experimentally.) The heat
capacity depends, naturally, on whether the heat dQ goes only into an increase of the internal energy dE
of the system (as it does if volume V is constant), or also into the mechanical work (-dW) that may be
performed at expansion - as it happens, for example, if pressure P, rather than volume V, is fixed (the
so-called isobaric process - see Fig. 4).[24]

[20] Let me emphasize that an adiabatic process is reversible, but not vice versa.

[21] Such conservation, expressed by Eqs. (18)-(19), is sometimes called the 1st law of thermodynamics. While it (in
contrast with the 2nd law) does not present any new law of nature on top of mechanics, and in particular was
already used de facto to write the first of Eqs. (5) and Eq. (14), such a grand name was quite justified in the
mid-19th century when the mechanical nature of the internal energy (thermal motion) was not at all clear. In this
context, the names of two great scientists, J. von Mayer (who was the first to conjecture the conservation of the sum
of the thermal and macroscopic mechanical energies in 1841), and J. Joule (who proved the conservation
experimentally two years later), have to be reverently mentioned.

[22] Actually, the 3rd law (also called the Nernst theorem) as postulated by W. Nernst in 1912 was different - and
really meaningful: “It is impossible for any procedure to lead to the isotherm T = 0 in a finite number of steps.” I
will discuss this postulate at the end of Sec. 6.

[23] By this definition, the full heat capacity of a system is an extensive variable. The capacity per either unit mass
or per particle (i.e., an intensive variable), is called the specific heat capacity or just the specific heat. Note,
however, that in some texts, the last term is used for the heat capacity of the system as a whole as well, so that
some caution is in order.

Fig. 1.4. The simplest implementation of an isobaric process (figure labels: Mg, dQ).

Hence we should discuss two different quantities, the heat capacity at fixed volume,

$$C_V \equiv \left(\frac{\partial Q}{\partial T}\right)_V, \tag{1.22}$$

and the heat capacity at fixed pressure,

$$C_P \equiv \left(\frac{\partial Q}{\partial T}\right)_P, \tag{1.23}$$

and expect that for all “normal” (mechanically stable) systems, C_P ≥ C_V. The difference between C_P and
C_V is rather minor for most liquids and solids, but may be very substantial for gases - see Sec. 4.
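Combining Eq. (20) with dQ = C_V dT, valid at fixed volume according to Eq. (22), gives the entropy change as the integral of C_V dT/T. The Python sketch below evaluates this integral numerically; the function name and the choice of a temperature-independent heat capacity in the usage example are assumptions made only for illustration.

```python
import math


def entropy_change(heat_capacity, t_start, t_end, n_steps=10_000):
    """Midpoint-rule estimate of the integral of C(T) dT / T.

    `heat_capacity` is any function C(T); temperatures are in energy units,
    so that both C and the resulting entropy change are dimensionless.
    """
    dt = (t_end - t_start) / n_steps
    total = 0.0
    for i in range(n_steps):
        t_mid = t_start + (i + 0.5) * dt
        total += heat_capacity(t_mid) * dt / t_mid
    return total


if __name__ == "__main__":
    # Assumed constant heat capacity C_V = 1.5, heated from T = 1 to T = 2.
    print(entropy_change(lambda t: 1.5, 1.0, 2.0), 1.5 * math.log(2.0))
```

For a constant C_V the integral is elementary, giving C_V ln(T_end/T_start), so the two printed numbers should agree to high accuracy.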


1.4. Thermodynamic potentials 


A technical disadvantage of Eqs. (22) and (23) is that dQ is not a differential of a function of
state of the system,[25] and hence (in contrast with temperature and pressure) does not allow an immediate
calculation of heat capacity, even if the relation between E, S, and V is known. For C_V the situation is
immediately correctable, because at fixed volume, dW = -PdV = 0 and hence, according to Eq. (18), dQ
= dE. Hence we may write

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V, \tag{1.24}$$


[24] A similar duality is possible for other pairs {q_j, f_j} of generalized coordinates and forces as well. For example,
if a long sample of a dielectric is placed into a parallel, uniform external electric field, the value of the field $\mathcal{E}$ is fixed, i.e.
does not depend on the sample’s polarization. However, if a thin sheet of such a material is perpendicular to the field,
then the value of the field $\mathcal{D}$ is fixed - see, e.g., EM Sec. 3.4.

[25] The same is true for work W, and in some textbooks this fact is emphasized by using a special sign for
differentials of these variables. I do not do this in my notes, because both dW and dQ are still very much usual
differentials: for example, dW is the difference between the mechanical work which has been done on our
system by the end of the infinitesimal interval we are considering, and that done by the beginning of that interval.

so that in order to calculate C_V from a certain statistical-physics model, we only need to calculate E as a
function of temperature and volume.

If we want to write a similarly convenient expression for C_P, the best way is to introduce a new
notion of so-called thermodynamic potentials - whose introduction and effective use is perhaps one of
the most impressive formalisms of thermodynamics. For that, let us combine Eqs. (1) and (18) to write
the “1st law of thermodynamics” in its most common form

$$dQ = dE + P\,dV. \tag{1.25}$$

At an isobaric process (Fig. 4), i.e. at P = const, this expression is equivalent to

$$(dQ)_P = dE + d(PV) = d(E + PV)_P. \tag{1.26}$$

Thus, if we introduce a new function with the dimensionality of energy:

$$H \equiv E + PV, \tag{1.27}$$

called enthalpy (or, more rarely, the “heat function” or “heat contents”),[26] we may rewrite Eq. (23) as

$$C_P = \left(\frac{\partial H}{\partial T}\right)_P. \tag{1.28}$$

Comparing Eq. (28) with (24) we see that for the heat capacity, enthalpy H plays the same role at fixed
pressure as the internal energy E plays at fixed volume.
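Equations (24) and (28) can be tried on the simplest model. The Python sketch below assumes a monatomic ideal gas with E = (3/2)NT and PV = NT (temperature in energy units); these relations are standard but are not derived at this point of the notes, so they are assumptions here, and all function names are hypothetical. The derivatives are taken by central finite differences.

```python
def energy(temperature, n_particles):
    """Assumed internal energy of a monatomic ideal gas: E = (3/2) N T."""
    return 1.5 * n_particles * temperature


def enthalpy(temperature, pressure, n_particles):
    """H = E + PV, with V taken from the assumed equation of state PV = NT."""
    volume = n_particles * temperature / pressure
    return energy(temperature, n_particles) + pressure * volume


def derivative(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def heat_capacities(temperature, pressure, n_particles):
    """C_V from Eq. (24) and C_P from Eq. (28) for the assumed model."""
    # E of this model does not depend on V, so fixing V needs no extra care.
    c_v = derivative(lambda t: energy(t, n_particles), temperature)
    c_p = derivative(lambda t: enthalpy(t, pressure, n_particles), temperature)
    return c_v, c_p


if __name__ == "__main__":
    print(heat_capacities(2.0, 3.0, 100))
```

For this model the sketch should return values close to C_V = (3/2)N and C_P = (5/2)N, so that C_P exceeds C_V by N, a substantial difference, as expected for gases.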

Now let us explore properties of the enthalpy for an arbitrary reversible process, i.e. lifting the
restriction P = const, but still keeping definition (27). Differentiating it, we get

$$dH = dE + P\,dV + V\,dP, \tag{1.29}$$

so that plugging in Eq. (17) for dE, we see that the two terms PdV cancel, yielding a very simple expression

$$dH = T\,dS + V\,dP. \tag{1.30}$$

This equation shows that if H has been found (say, experimentally measured or calculated for a certain
microscopic model) as a function of entropy S and pressure P, we can find temperature T and volume V
by simple partial differentiation:

$$T = \left(\frac{\partial H}{\partial S}\right)_P, \qquad V = \left(\frac{\partial H}{\partial P}\right)_S. \tag{1.31}$$

The comparison of the first of these relations with Eq. (9) shows that not only for the heat capacity, but
for temperature as well, enthalpy plays the same role at fixed pressure as played by the internal energy
at fixed volume. Moreover, the comparison of the second of Eqs. (31) with Eq. (15) shows that the
transfer from E to H corresponds to a simple swap of (-P) and V in the expressions for the
differentials of these variables.


This success immediately raises the question of whether we could develop it further, by defining
other useful thermodynamic potentials - variables with the dimensionality of energy that would have
similar properties, first of all a potential which would enable a similar swap of T and S in its full
differential. We already know that an adiabatic process is a reversible process with fixed entropy,
so that now we should analyze a reversible process with fixed temperature. Such an isothermal process
may be implemented, for example, by placing the system under consideration into thermal contact
with a much larger system (called either the heat bath, or “heat reservoir”, or “thermostat”) that remains
in thermodynamic equilibrium at all times - see Fig. 5.

[26] This function (as well as the Gibbs free energy G, see below) was introduced in 1875 by J. Gibbs, though
the term “enthalpy” was coined much later by H. Onnes.


Fig. 1.5. The simplest implementation of an isothermal process (figure label: heat bath).


Due to its large size, the heat bath temperature T does not depend on what is being done with our
system, and if the change is being done slowly enough (i.e. reversibly), that temperature is also the
temperature of our system - see Eq. (8) and its discussion. Let us calculate the elementary work dW for
such a reversible isothermal process. According to the general Eq. (18), dW = dE - dQ. Plugging in dQ
from Eq. (19), for T = const we get

$$(dW)_T = dE - T\,dS = d(E - TS) = dF, \tag{1.32}$$

where the following combination,

$$F \equiv E - TS, \tag{1.33}$$

is called the free energy (or the “Helmholtz free energy”, or just the “Helmholtz energy”[27]). Just as we
have done for the enthalpy, let us establish properties of this new thermodynamic potential for an
arbitrary (not necessarily isothermal) small reversible variation of variables, while keeping definition
(33). Differentiating this relation and using Eq. (17), we get

$$dF = -S\,dT - P\,dV. \tag{1.34}$$

Thus, if we know function F(T, V), we can calculate S and P by simple differentiation:

$$S = -\left(\frac{\partial F}{\partial T}\right)_V, \qquad P = -\left(\frac{\partial F}{\partial V}\right)_T. \tag{1.35}$$


It is easy to see that we can make the derivative system full and symmetric if we introduce one
more thermodynamic potential. Indeed, we have shown that each of the three already introduced
thermodynamic potentials (E, H, and F) has an especially simple full differential if it is considered a
function of its two canonical arguments: one of the “thermal variables” (either S or T) and one of the
“mechanical variables” (either P or V): 28

$$E = E(S,V), \qquad H = H(S,P), \qquad F = F(T,V). \tag{1.36}$$

In this list of pairs of the four arguments, only one pair is missing: (T, P). The thermodynamic function
of this pair, which gives the two other variables (S and V) by simple differentiation, is called the Gibbs
energy (or sometimes the “Gibbs free energy”): G = G(T, P). The way to define it in a symmetric way is
evident from the so-called circular diagram shown in Fig. 6.

27 After H. von Helmholtz (1821-1894). The last term was recommended by the most recent (1988) IUPAC
decision, but I will use the first term, which prevails in physics literature. Its origin stems from Eq. (32):
F may be interpreted as the part of the internal energy that is “free” to be transferred to mechanical
work - at a reversible, isothermal process only!




Fig. 1.6. (a) Circular diagram and (b) its use for variable calculation. The thermodynamic potentials
are shown in boldface, each flanked by its two canonical arguments.


In this diagram, each thermodynamic potential is placed between its two canonical arguments -
see Eq. (36). The left two arrows in Fig. 6a show the way the potentials H and F have been obtained
from energy E - see Eqs. (27) and (33). This diagram hints that G has to be defined as shown by the
right two arrows on that panel, i.e. as

[Gibbs energy: definition]
$$G = E - TS + PV = H - TS = F + PV. \tag{1.37}$$

In order to verify this idea, let us calculate the full differential of this new potential, using, e.g., the last
form of Eq. (37) together with Eq. (34):

[Gibbs energy: differential]
$$dG = dF + d(PV) = (-SdT - PdV) + (PdV + VdP) = -SdT + VdP, \tag{1.38}$$

so that if we know the function G(T, P), we can indeed readily calculate entropy and volume:

$$S = -\left(\frac{\partial G}{\partial T}\right)_P, \qquad V = \left(\frac{\partial G}{\partial P}\right)_T. \tag{1.39}$$


The circular diagram completed in this way is a good mnemonic tool for describing Eqs. (9),
(15), (31), (35), and (39), which express thermodynamic variables as partial derivatives of the
thermodynamic potentials. Indeed, the variable in any corner of the diagram may be found as a
derivative of any of the two potentials that are not its immediate neighbors, over the variable in the
opposite corner. For example, the red line in Fig. 6b corresponds to the second of Eqs. (39), while the
blue line, to the second of Eqs. (31). At this, the derivatives giving the variables of the upper row
(S and P) have to be taken with negative signs, while those giving the variables of the bottom row
(V and T), with positive signs. 29

28 Note the similarity of this situation with that in analytical (classical) mechanics (see, e.g., CM
Chapters 2 and 10): the Lagrangian function may be used to get simple equations of motion if it is
expressed as a function of generalized coordinates and velocities, while in order to use the Hamiltonian
function in a similar way, it has to be expressed as a function of the generalized coordinates and momenta.
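The second-derivative (Maxwell) relations mentioned in footnote 29 follow from the same differentials. As a brief sketch, using only Eqs. (34) and (38) and assuming that the potentials are smooth enough for their mixed second derivatives to commute:

```latex
\left(\frac{\partial S}{\partial V}\right)_T
  = -\frac{\partial^2 F}{\partial V\,\partial T}
  = -\frac{\partial^2 F}{\partial T\,\partial V}
  = \left(\frac{\partial P}{\partial T}\right)_V ,
\qquad
\left(\frac{\partial S}{\partial P}\right)_T
  = -\frac{\partial^2 G}{\partial P\,\partial T}
  = -\frac{\partial^2 G}{\partial T\,\partial P}
  = -\left(\frac{\partial V}{\partial T}\right)_P .
```

The first chain uses Eqs. (35), and the second one, Eqs. (39).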

Now I have to justify the collective name “thermodynamic potentials” used for E, H, F, and G.
For that, let us consider an irreversible process, for example, a direct thermal contact of two bodies with
different initial temperatures. As we have seen in Sec. 2, at such a process, the entropy may grow even
without the external heat flow: dS > 0 at dQ = 0 - see Eq. (12). For a more general process with
$dQ \neq 0$, this means that entropy may grow faster than predicted by Eq. (19), which has been derived
for a reversible process, so that

$$dS \geq \frac{dQ}{T}, \tag{1.40}$$

with the equality approached in the reversible limit. Plugging Eq. (40) into Eq. (18) (which, being just
the energy conservation law, remains valid for irreversible processes as well), we get

$$dE \leq TdS - PdV. \tag{1.41}$$


Now we can use this relation to have a look at the behavior of other thermodynamic potentials in
irreversible situations, still keeping their definitions given by Eqs. (27), (33), and (37). Let us start from
the (very common) case when both temperature T and volume V are kept constant. If the process were
reversible, according to Eq. (34), the full time derivative of free energy F would equal zero. Equation
(41) says that at an irreversible process it is not necessarily so: if dT = dV = 0, then

$$\frac{dF}{dt} = \frac{d}{dt}(E - TS) = \frac{dE}{dt} - T\frac{dS}{dt} \leq T\frac{dS}{dt} - T\frac{dS}{dt} = 0. \tag{1.42}$$

Hence, in the general (irreversible) situation, function F can only decrease, but not increase in time. This
means that F eventually approaches its minimum value F(T, V), which is given by the equations of
reversible thermodynamics.

Thus in the case T = const, V = const, the free energy F, i.e. the difference E - TS, plays the role
of the potential energy in the classical mechanics of dissipative processes: its minimum corresponds to
the (in the case of F, thermodynamic) equilibrium of the system. This is one of the key results of
thermodynamics, and I invite the reader to give it some thought. One of its possible handwaving
interpretations is that the heat bath with fixed T > 0, i.e. with a substantial thermal agitation of its
components, “wants” to impose thermal disorder in the system immersed in it by “rewarding” it (by
lowering its F) for any increase of disorder.
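As a hedged illustration of this minimum principle, the sketch below considers a cylinder of fixed total volume V at fixed temperature T, divided by a freely movable piston into two compartments holding N1 and N2 particles of an ideal classical gas. It assumes (borrowing from the ideal-gas calculation later in this section, Eq. (45)) that the volume-dependent part of the free energy of N particles is -NT ln(V/N). The function names, the parameter values, and the golden-section search are illustrative choices, not part of the original text. Minimizing the total F over the piston position should then reproduce the mechanical equilibrium, i.e. equal pressures in both compartments.

```python
import math

def F_total(V1, V, N1, N2, T):
    # Volume-dependent part of the total free energy of the two compartments.
    # Terms depending only on T are dropped: they do not affect the minimum.
    return -N1 * T * math.log(V1 / N1) - N2 * T * math.log((V - V1) / N2)

def golden_section_min(func, a, b, tol=1e-10):
    # Golden-section search for the minimum of a unimodal function on [a, b].
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

def equilibrium_volume(V, N1, N2, T):
    # Piston position (volume of compartment 1) that minimizes the total F.
    return golden_section_min(lambda v: F_total(v, V, N1, N2, T), 1e-6, V - 1e-6)

V, N1, N2, T = 3.0, 1.0, 2.0, 1.5   # arbitrary illustrative values, T in energy units
V1 = equilibrium_volume(V, N1, N2, T)
P1, P2 = N1 * T / V1, N2 * T / (V - V1)
print("V1 =", V1, " P1 =", P1, " P2 =", P2)
```

For these numbers, the analytical minimum is at V1 = V N1/(N1 + N2), where the two pressures coincide.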

Repeating the calculation for the case T = const, P = const, it is easy to see that in this case the
same role is played by the Gibbs energy:

$$\frac{dG}{dt} = \frac{d}{dt}(E - TS + PV) = \frac{dE}{dt} - T\frac{dS}{dt} + P\frac{dV}{dt} \leq \left(T\frac{dS}{dt} - P\frac{dV}{dt}\right) - T\frac{dS}{dt} + P\frac{dV}{dt} = 0, \tag{1.43}$$

so that the thermal equilibrium now corresponds to the minimum of G rather than F. One can argue very
convincingly that the difference G - F = PV between these two potentials (also equal to H - E) has very
little to do with thermodynamics at all, because this difference exists (although not much advertised) in
classical mechanics as well. 30 Indeed, the difference may be generalized as
$G - F = -\mathcal{F}_j q_j$, where $q_j$ is any generalized coordinate and $\mathcal{F}_j$ is the
corresponding generalized force - see Eq. (1) and its discussion. In this case the minimum of F
corresponds to the equilibrium of an autonomous system (with $\mathcal{F}_j = 0$), while the equilibrium
position of the same system under the action of external force $\mathcal{F}_j$ is given by the minimum
of G. Thus the external force “wants” the system to yield to its effect, “rewarding” it by lowering its G.
(The analogy with the “disorder pressure” by a heat bath, discussed in the last paragraph, is evident.)

29 There is also a wealth of other relations between thermodynamic variables that may be presented as
second derivatives of the thermodynamic potentials, including four Maxwell relations such as
$(\partial S/\partial V)_T = (\partial P/\partial T)_V$, etc. (They may be readily recovered from the
well-known property of a function of two independent arguments, say, f(x, y):
$\partial(\partial f/\partial x)/\partial y = \partial(\partial f/\partial y)/\partial x$.) In this chapter,
I will list only the thermodynamic relations that will be used later in the course; a more complete list
may be found, e.g., in Sec. 16 of the textbook by L. Landau and E. Lifshitz, Statistical Physics, Part 1,
3rd ed., Pergamon, 1980 (and its later re-printings).

For the two remaining thermodynamic potentials, E and H, calculations similar to Eqs. (42) and
(43) make less sense, because that would require taking S = const (with V = const for E, and P = const
for H), but it is hard to prevent the entropy from growing if initially it had been lower than its
equilibrium value, at least on a long-term basis. 31 Thus the circular diagram is not so symmetric after
all: G and/or F are somewhat more useful for most practical calculations than E and H.

One more important conceptual question is why the main task of statistical physics should be the
calculation of thermodynamic potentials, rather than just a relation between P, V, and T. (Such a relation
is called the equation of state of the system.) Let us explore this issue on the example of an ideal
classical gas in thermodynamic equilibrium, for which the equation of state should be well known to the
reader from undergraduate physics (in Chapter 3, it will be derived from statistics):

[Ideal gas equation of state]
$$PV = NT, \tag{1.44}$$

where N is the number of particles in volume V. 32 Let us try to use it for the calculation of all
thermodynamic potentials, and all other thermodynamic variables discussed above. We may start, for
example, from the calculation of the free energy F. Indeed, solving Eq. (44) for pressure, P = NT/V, and
integrating the second of Eqs. (35), we get

$$F = -\int P\,dV\Big|_T = -NT\int\frac{dV}{V} = -NT\int\frac{d(V/N)}{(V/N)} = -NT\ln\frac{V}{N} + Nf(T), \tag{1.45}$$

where I have divided V by N in both instances just to present F as a manifestly extensive variable, in this
uniform system proportional to N. The integration “constant” f(T) is some function of temperature that
cannot be recovered from the equation of state. This function also affects all other thermodynamic
potentials, and entropy. Indeed, using the first of Eqs. (35) together with Eq. (45), we get

$$S = -\left(\frac{\partial F}{\partial T}\right)_V = N\left[\ln\frac{V}{N} - \frac{df}{dT}\right], \tag{1.46}$$

30 See, e.g., CM Sec. 1.5.

31 There are a few practical systems, notably including the so-called magnetic refrigerators (to be
discussed in Chapter 4), in which the natural growth of S is so slow that the condition S = const may be
closely approached.

32 This equation was first derived from experimental data by E. Clapeyron (in 1834) in the form
$PV = nRT_K$, where n is the number of moles in the gas sample, and $R \approx 8.31$ J/mole·K is the
so-called gas constant. This form is equivalent to Eq. (44), taking into account that $R = k_B N_A$, where
$N_A \approx 6.02\times10^{23}$ mole$^{-1}$ is the so-called Avogadro number, i.e. the number of
molecules per mole. (By definition of the mole, $N_A$ is just the reciprocal mass, in grams, of a baryon -
more exactly, by convention, of a 1/12th part of the carbon-12 atom.)


and now can combine Eqs. (33) and (46) to calculate the (internal) energy,

$$E = F + TS = \left[-NT\ln\frac{V}{N} + Nf\right] + T\left[N\ln\frac{V}{N} - N\frac{df}{dT}\right] = N\left[f - T\frac{df}{dT}\right], \tag{1.47}$$

then use Eqs. (27), (44) and (47) to calculate enthalpy,

$$H = E + PV = E + NT = N\left[f - T\frac{df}{dT} + T\right], \tag{1.48}$$

and, finally, plug Eqs. (44) and (45) into Eq. (37) to calculate the Gibbs energy

$$G = F + PV = N\left[-T\ln\frac{V}{N} + f + T\right]. \tag{1.49}$$


In particular, Eq. (47) describes a very important property of the ideal classical gas: its energy
depends only on temperature, but not on volume or pressure. One might question whether function f(T)
may be physically insignificant, just like the arbitrary constant that may always be added to the potential
energy in nonrelativistic mechanics. In order to address this concern, let us calculate, from Eqs. (24) and
(28), both heat capacities, which are readily measurable quantities:

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = -NT\frac{d^2 f}{dT^2}, \tag{1.50}$$

$$C_P = \left(\frac{\partial H}{\partial T}\right)_P = N\left[-T\frac{d^2 f}{dT^2} + 1\right] = C_V + N. \tag{1.51}$$

We see that function f(T), or at least its second derivative, is measurable. 33 (In Chapter 3, we will
calculate this function for two simple “microscopic” models of the ideal classical gas.) The meaning of
this function is evident from the physical picture of the ideal gas: pressure P exerted on the walls of the
containing volume is produced only by the translational motion of the gas molecules, while their
internal energy E (and hence other thermodynamic potentials) may also have contributions from the
internal motion of the molecules - their rotations, vibrations, etc. Thus, the equation of state does not
give the full thermodynamic description of a system, while the thermodynamic potentials do.
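The chain of Eqs. (45)-(51) can be checked numerically. The sketch below is only an illustration under stated assumptions: it picks a trial function f(T) = -(3/2) T ln T (a hypothetical choice made here for definiteness, not a result of this chapter), measures temperature in energy units as in Eq. (44), and replaces the partial derivatives of Eqs. (35) by central finite differences. All function names are illustrative.

```python
import math

def f(T):
    # Hypothetical trial function f(T); any smooth function could be used instead.
    return -1.5 * T * math.log(T)

def F(T, V, N):
    # Free energy of the ideal classical gas, Eq. (45).
    return -N * T * math.log(V / N) + N * f(T)

def deriv(func, x, h=1e-5):
    # Central finite-difference derivative of a one-argument function.
    return (func(x + h) - func(x - h)) / (2.0 * h)

def potentials(T, V, N):
    S = -deriv(lambda t: F(t, V, N), T)   # first of Eqs. (35)
    P = -deriv(lambda v: F(T, v, N), V)   # second of Eqs. (35)
    F0 = F(T, V, N)
    E = F0 + T * S                        # from Eq. (33)
    H = E + P * V                         # Eq. (27)
    G = F0 + P * V                        # Eq. (37)
    return {"S": S, "P": P, "F": F0, "E": E, "H": H, "G": G}

def heat_capacities(T, V, N, h=1e-3):
    # C_V: derivative of E over T at fixed V, Eq. (50).
    C_V = deriv(lambda t: potentials(t, V, N)["E"], T, h)
    # C_P: derivative of H over T at fixed P; the volume follows T as V = N t / P.
    P = potentials(T, V, N)["P"]
    C_P = deriv(lambda t: potentials(t, N * t / P, N)["H"], T, h)
    return C_V, C_P

T, V, N = 2.0, 5.0, 10.0   # arbitrary illustrative values
p = potentials(T, V, N)
C_V, C_P = heat_capacities(T, V, N)
print("PV - NT   =", p["P"] * V - N * T)
print("E         =", p["E"])
print("C_V       =", C_V)
print("C_P - C_V =", C_P - C_V)
```

With this particular f(T), Eq. (47) gives E = (3/2)NT, so that the printed values illustrate three statements of the text: the equation of state is recovered from F, the energy does not depend on V, and the difference of the heat capacities equals N regardless of f(T).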


1.5. Systems with variable number of particles 

Now we have to consider one more important case when the number N of particles in a system is
not rigidly fixed, but may change as a result of a thermodynamic process. Typical examples of such a
system are a gas sample separated from the environment by a penetrable partition (Fig. 7), and a gas in
contact with the liquid of the same molecules.


33 Note, however, that the difference $C_P - C_V = N$ (if temperature is measured in kelvins,
$C_P - C_V = nR$) is independent of f(T). (It is possible to show that the difference $C_P - C_V$ is fully
determined by the equation of state for any medium.)

[Figure 1.7; label: “environment”]

Fig. 1.7. Example of a system with variable number of particles.


Let us analyze this situation for the simplest case when all the particles are similar (though the
analysis may be readily extended to systems with particles of several sorts). In this case we can consider
N as an independent thermodynamic variable whose variation may change the energy E of the system, so
that (for a slow, reversible process) Eq. (17) should now be generalized as

[Chemical potential: definition]
$$dE = TdS - PdV + \mu dN, \tag{1.52}$$

where $\mu$ is a new function of state, called the chemical potential. 34 Keeping the definitions of other
thermodynamic potentials, given by Eqs. (27), (33), and (37), intact, we see that the expressions for their
differentials should be generalized as

$$dH = TdS + VdP + \mu dN, \tag{1.53a}$$
$$dF = -SdT - PdV + \mu dN, \tag{1.53b}$$
$$dG = -SdT + VdP + \mu dN, \tag{1.53c}$$

so that the chemical potential may be calculated as either of the following derivatives: 35

$$\mu = \left(\frac{\partial E}{\partial N}\right)_{S,V} = \left(\frac{\partial H}{\partial N}\right)_{S,P} = \left(\frac{\partial F}{\partial N}\right)_{T,V} = \left(\frac{\partial G}{\partial N}\right)_{T,P}. \tag{1.54}$$


Despite their similarity, one of Eqs. (53)-(54) is more consequential than the others. Indeed, the
Gibbs energy G is the only thermodynamic potential that is a function of two intensive parameters, T
and P. However, as all thermodynamic potentials, G has to be extensive, so that in a system of similar
particles it has to be proportional to N:

$$G = Nf(T, P). \tag{1.55}$$

Plugging this expression into the last of Eqs. (54), we see that $\mu$ equals f(T, P). In other words,

[Chemical potential as the Gibbs energy]
$$\mu = \frac{G}{N}, \tag{1.56}$$

so that the chemical potential is just the Gibbs energy per particle.

34 This name, of a historic origin, is a bit misleading: as evident from Eq. (52), $\mu$ has a clear physical
sense of the average energy cost of adding one more particle to the system of $N \gg 1$ particles.

35 Note that strictly speaking, Eqs. (9), (15), (31), (35) and (39) should now be generalized by adding one
more lower index, N, to the corresponding derivatives.
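The equivalence of the third form of Eqs. (54) and Eq. (56) can be checked on the ideal classical gas. The sketch below is an illustration only: it assumes the free energy of Eq. (45) with a hypothetical trial function f(T) = -(3/2) T ln T, treats N as a continuous variable, and approximates the derivative over N by a central finite difference. The function names are illustrative.

```python
import math

def f(T):
    # Hypothetical trial function f(T), chosen only for illustration.
    return -1.5 * T * math.log(T)

def F(T, V, N):
    # Ideal-gas free energy, Eq. (45), now treated as a function of N as well.
    return -N * T * math.log(V / N) + N * f(T)

def mu_from_F(T, V, N, h=1e-5):
    # Third form of Eqs. (54): derivative of F over N at fixed T and V.
    return (F(T, V, N + h) - F(T, V, N - h)) / (2.0 * h)

def mu_from_G(T, V, N):
    # Eq. (56): Gibbs energy per particle, with G = F + PV and PV = NT.
    G = F(T, V, N) + N * T
    return G / N

T, V, N = 2.0, 5.0, 10.0   # arbitrary illustrative values
print("mu from dF/dN:", mu_from_F(T, V, N))
print("mu from G/N:  ", mu_from_G(T, V, N))
```

The two numbers should agree up to the finite-difference error, although F itself is not simply proportional to N at fixed V.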


In order to demonstrate how vital the notion of chemical potential may be, let us consider the
situation (parallel to that shown in Fig. 2) when a system consists of two parts, with equal pressure and
temperature, that can exchange particles at a relatively slow rate (much slower than the speed of internal
relaxation inside each of the parts). Then we can write two equations similar to Eq. (5):

$$N = N_1 + N_2, \qquad G = G_1 + G_2, \tag{1.57}$$

where N = const, and Eq. (56) may be used to describe each component of G:

$$G = \mu_1 N_1 + \mu_2 N_2. \tag{1.58}$$

Plugging $N_2$ expressed from the first of Eqs. (57), $N_2 = N - N_1$, into Eq. (58), we see that

$$\frac{dG}{dN_1} = \mu_1 - \mu_2, \tag{1.59}$$

so that the minimum of G is achieved at $\mu_1 = \mu_2$. Hence, in the conditions of fixed temperature and
pressure, i.e. when G is the appropriate thermodynamic potential, the chemical potentials of the system
parts should be equal - the so-called chemical equilibrium.

Later we will also run into cases when the volume V of a system, its temperature T, and the chemical
potential $\mu$ are all fixed. (The last condition may be readily implemented by allowing the system of
interest to exchange particles with a reservoir so large that its $\mu$ stays constant.) A thermodynamic
potential appropriate for this case may be obtained from the free energy F by subtraction of the product
$\mu N$, resulting in the so-called grand thermodynamic potential (or the “Landau potential”)

[Grand potential: definition]
$$\Omega = F - \mu N = F - \frac{G}{N}N = F - G = -PV. \tag{1.60}$$


Indeed, for a reversible process, the full differential of this potential is

[Grand potential: full differential]
$$d\Omega = dF - d(\mu N) = (-SdT - PdV + \mu dN) - (\mu dN + Nd\mu) = -SdT - PdV - Nd\mu, \tag{1.61}$$

so that if $\Omega$ has been calculated as a function of T, V and $\mu$, other thermodynamic variables may
be found as

$$S = -\left(\frac{\partial\Omega}{\partial T}\right)_{V,\mu}, \qquad P = -\left(\frac{\partial\Omega}{\partial V}\right)_{T,\mu}, \qquad N = -\left(\frac{\partial\Omega}{\partial\mu}\right)_{T,V}. \tag{1.62}$$

For an irreversible process, acting exactly as we have done with other potentials, it is
straightforward to prove that in the conditions of fixed T, V, and $\mu$, $d\Omega/dt \leq 0$, so that the
system’s equilibrium indeed corresponds to the minimum of the grand potential $\Omega$.
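The omitted step may be sketched as follows, under the assumption (not spelled out above) that for an irreversible process inequality (41) acquires the same chemical-potential term as Eq. (52), i.e. $dE \leq TdS - PdV + \mu dN$. Writing $\Omega = F - \mu N = E - TS - \mu N$ and taking $dT = dV = d\mu = 0$:

```latex
\frac{d\Omega}{dt}
  = \frac{d}{dt}\left(E - TS - \mu N\right)
  = \frac{dE}{dt} - T\frac{dS}{dt} - \mu\frac{dN}{dt}
  \leq \left(T\frac{dS}{dt} + \mu\frac{dN}{dt}\right) - T\frac{dS}{dt} - \mu\frac{dN}{dt}
  = 0 .
```

This is the direct analog of Eqs. (42) and (43), with the pair of fixed variables {T, V} extended by $\mu$.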


1.6. Thermal machines 

In order to complete this brief review of thermodynamics, I cannot pass over the topic of thermal
machines - not because it will be used much in this course, but mostly because of its practical and
historic significance. (Indeed, the whole field of thermodynamics was spurred by the famous 1824 work
by S. Carnot, which in particular gave an alternative, indirect form of the 2nd law of thermodynamics -
see below.)

Figure 8a shows the generic scheme of a thermal machine that may perform mechanical work on
the environment (in the notation of Eq. (1), equal to $-\mathcal{W}$) during each cycle of the
expansion/compression of the “working gas”, by transferring different amounts of heat from a
high-temperature heat bath ($Q_H$) and to the low-temperature bath ($Q_L$). One relation between the
three amounts $Q_H$, $Q_L$, and $\mathcal{W}$ is immediately given by the energy conservation (i.e. by
“the 1st law of thermodynamics”):

$$Q_H - Q_L = -\mathcal{W}. \tag{1.63}$$

From Eq. (1), the mechanical work during the cycle may be calculated as

$$-\mathcal{W} = \oint P\,dV, \tag{1.64}$$

i.e. equals the area encircled by the representing point on the [P, V] plane - see Fig. 8b. 36 Hence, the
work depends on the exact form of the cycle, which in turn depends not only on $T_H$ and $T_L$, but also
on the working gas’ properties.



Fig. 1.8. (a) The simplest implementation of a thermal machine, and (b) the graphic presentation of the 
mechanical work it performs. On panel (b), solid arrow indicates the heat engine cycle direction, while 
the dashed arrow, the refrigerator cycle direction. 


An exception from this rule is the famous Carnot cycle, consisting of two isothermal and two
adiabatic processes (all reversible!). In its heat engine’s form, the cycle starts from an isothermal
expansion of the gas in contact with the hot bath (i.e. at $T = T_H$), followed by its additional adiabatic
expansion until T drops to $T_L$. Then an isothermal compression of the gas is performed in its contact
with the cold bath (at $T = T_L$), followed by its additional adiabatic compression to raise its temperature
to $T_H$ again, after which the cycle is repeated again and again. (Note that during this cycle the working
gas is never in contact with both heat baths simultaneously, thus avoiding the irreversible heat transfer
between them.) The cycle’s shape on the [V, P] plane depends on the exact properties of the working gas
and may be rather complicated. However, since the entropy is constant at any adiabatic process, the
Carnot cycle’s shape on the [S, T] plane is always rectangular - see Fig. 9. 37

36 Note that the positive sign of the circular integral corresponds to the clockwise rotation of the point,
so that the work ($-\mathcal{W}$) done by the working gas is positive at the clockwise rotation (pertinent
to heat engines) and negative in the opposite case (implemented in refrigerators and heat pumps).

37 A cycle with an [S, T] shape very close to the Carnot (rectangular) one may be implemented at the
already mentioned magnetic (or “adiabatic-demagnetization”) refrigeration, using the alignment of either
atomic or nuclear spins by external magnetic field. In such refrigerators (to be further discussed in the
next chapter), the role of the {-P, V} pair of variables is played by the {…, B} pair - see Eq. (3).

Fig. 1.9. Representation of the Carnot cycle (a) on the [S, T] plane and (b) the [V, P] plane
(schematically). The meaning of arrows is the same as in Fig. 8.


Since during each isotherm, the working gas is brought into thermal contact only with the
corresponding heat bath, Eq. (19), dQ = TdS, may be immediately integrated to yield

$$Q_H = T_H(S_2 - S_1), \qquad Q_L = T_L(S_2 - S_1). \tag{1.65}$$

Hence the ratio of these two heat flows is completely determined by their temperature ratio:

$$\frac{Q_H}{Q_L} = \frac{T_H}{T_L}, \tag{1.66}$$

regardless of the working gas properties. Equations (63) and (66) are sufficient to find the ratio of work
$-\mathcal{W}$ to any of $Q_H$ and $Q_L$. For example, the main figure-of-merit of a thermal machine
used as a heat engine ($Q_H > 0$, $Q_L > 0$, $-\mathcal{W} > 0$), is its efficiency

[Heat engine efficiency: definition]
$$\eta \equiv \frac{-\mathcal{W}}{Q_H} = \frac{Q_H - Q_L}{Q_H} = 1 - \frac{Q_L}{Q_H}. \tag{1.67}$$

For the Carnot cycle, Eq. (66) immediately yields the famous relation,

[Carnot cycle’s efficiency]
$$\eta_\mathrm{Carnot} = 1 - \frac{T_L}{T_H}, \tag{1.68}$$

which shows that at given $T_L$ (that is typically the ambient temperature ~300 K), the efficiency may be
increased, ultimately to 1, by raising the temperature of the heat source.

On the other hand, if the cycle is reversed (see the dashed arrows in Figs. 8 and 9), the same
thermal machine may serve as a refrigerator, providing the heat removal from the low-temperature bath
($Q_L < 0$) for the cost of consuming external work: $\mathcal{W} > 0$. This reversal does not affect the
basic relation (63) that may be used to calculate the relevant figure-of-merit, called the cooling
coefficient of performance ($\mathrm{COP}_\mathrm{cooling}$):

$$\mathrm{COP}_\mathrm{cooling} \equiv \frac{|Q_L|}{\mathcal{W}} = \frac{Q_L}{Q_H - Q_L}. \tag{1.69}$$


Notice that this coefficient may readily be above unity; in particular, for the Carnot cycle we may use
Eq. (66) (which is also unaffected by the cycle reversal) to get

$$(\mathrm{COP}_\mathrm{cooling})_\mathrm{Carnot} = \frac{T_L}{T_H - T_L}, \tag{1.70}$$


so that the $\mathrm{COP}_\mathrm{cooling}$ is larger than 1 at $T_H < 2T_L$, and may even be very large
when the temperature difference ($T_H - T_L$), sustained by the refrigerator, tends to zero. For example,
in a typical air-conditioning system, $T_H - T_L \sim T_L/30$, so that the Carnot value of
$\mathrm{COP}_\mathrm{cooling}$ is as high as ~30, while in the state-of-the-art commercial HVAC systems
it is in the range from 3 to 4. This is why the term “cooling efficiency”, used in some textbooks instead of
$\mathrm{COP}_\mathrm{cooling}$, may be misleading.

Since in the reversed cycle $Q_H = -\mathcal{W} + Q_L < 0$, the system also provides heat flow into the
hotter heat bath, and thus may be used as a heat pump. However, the figure-of-merit appropriate for this
application is different:

$$\mathrm{COP}_\mathrm{heating} \equiv \frac{|Q_H|}{\mathcal{W}} = \frac{Q_H}{Q_H - Q_L}, \tag{1.71}$$

so that for the Carnot cycle

$$(\mathrm{COP}_\mathrm{heating})_\mathrm{Carnot} = \frac{T_H}{T_H - T_L}. \tag{1.72}$$


Note that this COP is always larger than 1, meaning that the Carnot heat pump is always more
efficient than the direct conversion of work into heat (where $Q_H = -\mathcal{W}$ and
$\mathrm{COP}_\mathrm{heating} = 1$), though practical electricity-driven heat pumps are substantially
more complex (and hence more expensive) than, say, simple electric heaters. Such heat pumps, with typical
$\mathrm{COP}_\mathrm{heating}$ values around 4 in summer and 2 in winter, are frequently used for
heating large buildings.
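The three Carnot figures-of-merit, Eqs. (68), (70), and (72), are simple enough to tabulate directly. The sketch below is an illustration with function names chosen here (they do not come from the text); it assumes absolute temperatures with T_H > T_L > 0 and reproduces the air-conditioning estimate quoted above.

```python
def carnot_efficiency(T_H, T_L):
    # Eq. (68): heat-engine efficiency of the Carnot cycle.
    return 1.0 - T_L / T_H

def carnot_cop_cooling(T_H, T_L):
    # Eq. (70): cooling coefficient of performance of the reversed Carnot cycle.
    return T_L / (T_H - T_L)

def carnot_cop_heating(T_H, T_L):
    # Eq. (72): heating coefficient of performance of the reversed Carnot cycle.
    return T_H / (T_H - T_L)

T_L = 300.0                # ambient temperature, in kelvins
T_H = T_L + T_L / 30.0     # the air-conditioning example: T_H - T_L = T_L/30

print("efficiency :", carnot_efficiency(T_H, T_L))
print("COP cooling:", carnot_cop_cooling(T_H, T_L))
print("COP heating:", carnot_cop_heating(T_H, T_L))
```

Two identities that follow from Eqs. (68)-(72) are easy to verify with these functions: the heating COP exceeds the cooling COP by exactly 1, and it is the reciprocal of the Carnot efficiency for the same pair of temperatures.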

I have dwelled so long on the Carnot cycle, because it has a remarkable property: the highest
possible efficiency of all heat-engine cycles. Indeed, in the Carnot cycle the transfer of heat between any
heat bath and the working gas is performed reversibly, when their temperatures are equal. If this is not
so, heat might flow from a hotter to a colder system without performing any work. Hence the result (68)
also yields the maximum efficiency of any heat engine. In particular, it shows that
$\eta_\mathrm{max} = 0$ at $T_H = T_L$, i.e., no heat engine can perform any mechanical work in the
absence of temperature gradients. 38 In some alternative axiomatic systems of thermodynamics, this fact,
i.e. the impossibility of the direct conversion of heat to work, is postulated, and serves the role of the
2nd law.

Note also that according to Eq. (70), $\mathrm{COP}_\mathrm{cooling}$ of the Carnot cycle tends to zero
at $T_L \to 0$, making it impossible to reach the absolute zero of temperature, and hence illustrating the
meaningful (Nernst’s) formulation of the 3rd law of thermodynamics. Indeed, let us prescribe a certain
(but very large) heat capacity C(T) to the low-temperature bath, and use the definition of this variable to
write the following evident expression for the (very small) change of its temperature as a result of a
relatively small number dn of similar refrigeration cycles:

$$C(T_L)\,dT_L = Q_L\,dn. \tag{1.73}$$

Together with Eq. (66), this relation yields

$$C(T_L)\,dT_L = -\frac{|Q_H|}{T_H}T_L\,dn, \tag{1.74}$$

so that if we perform many (n) cycles (with constant $Q_H$ and $T_H$), the initial and final values of
$T_L$ obey the following equation

$$\int_{T_\mathrm{ini}}^{T_\mathrm{fin}}\frac{C(T)\,dT}{T} = -\frac{|Q_H|}{T_H}n. \tag{1.75}$$

For example, if C(T) is a constant, Eq. (75) yields an exponential law,

$$T_\mathrm{fin} = T_\mathrm{ini}\exp\left\{-\frac{|Q_H|}{CT_H}n\right\}, \tag{1.76}$$

with the absolute zero not reached at any finite n. Relation (75) proves the Nernst theorem if C(T) does
not vanish at $T \to 0$, but for such metastable systems as glasses the situation is more complicated. 39
Fortunately, this issue does not affect other aspects of statistical physics - at least those to be discussed
in this course.

38 Such a hypothetical (and impossible!) heat engine, which would violate the 2nd law of
thermodynamics, is called the “perpetual motion machine of the 2nd kind” - in contrast to the “perpetual
motion machine of the 1st kind” that would violate the 1st law, i.e., the energy conservation.
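The approach to the exponential law (76) can be illustrated by applying Eq. (74) cycle by cycle. The sketch below assumes a temperature-independent heat capacity C, uses the magnitude |Q_H| of the heat delivered to the hot bath per cycle (called Q_H_abs in the code), and works in arbitrary consistent units; the function names and the numbers are illustrative only.

```python
import math

def cool_down(T_ini, C, Q_H_abs, T_H, n_cycles):
    # Apply Eq. (74) with dn = 1: each cycle lowers T_L by (|Q_H| / T_H) * T_L / C.
    T_L = T_ini
    for _ in range(n_cycles):
        T_L -= (Q_H_abs / T_H) * T_L / C
    return T_L

def cool_down_exponential(T_ini, C, Q_H_abs, T_H, n_cycles):
    # Closed-form law of Eq. (76), valid for constant C.
    return T_ini * math.exp(-Q_H_abs * n_cycles / (C * T_H))

T_ini, T_H = 300.0, 300.0
C, Q_H_abs = 1000.0, 300.0   # large C: the temperature change per cycle is small

for n in (100, 1000, 5000):
    stepwise = cool_down(T_ini, C, Q_H_abs, T_H, n)
    closed = cool_down_exponential(T_ini, C, Q_H_abs, T_H, n)
    print(n, stepwise, closed)
```

For a large heat capacity the stepwise and the closed-form results stay close to each other, and in both the temperature remains positive after any finite number of cycles.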


1.7. Exercise problems 

1.1. Two bodies, with negligible thermal expansion coefficients and constant heat capacities $C_1$
and $C_2$, are placed into a weak thermal contact, at different initial temperatures $T_1$ and $T_2$.
Calculate the full change of entropy of the system before it reaches the full thermal equilibrium.

1.2. A gas has the following properties:

(i) $C_V = aT^b$, and

(ii) the work $\mathcal{W}_T$ needed for its isothermal compression from $V_2$ to $V_1$ equals
$cT\ln(V_2/V_1)$,

where a, b, and c are constants. Find the equation of state of the gas, and calculate the temperature
dependences of its entropy S, and thermodynamic potentials E, H, F, G and $\Omega$.

1.3. A vessel with an ideal classical gas of indistinguishable molecules is separated by a partition
so that the number N of molecules in both parts is the same but their volumes are different. The gas is in
thermal equilibrium, and its pressure in one part is $P_1$, and in another, $P_2$. Calculate the change of
entropy caused by a fast removal of the partition. Analyze the result.

1.4. An ideal classical gas of N particles is initially confined to volume V, and left alone to reach
the thermal equilibrium with a heat bath of temperature T. Then the gas is allowed to expand to volume
V’ > V in one of the following ways:

(i) The expansion is slow, so that due to the thermal contact with the heat bath, the gas temperature
remains equal to T all the time.

(ii) The partition separating volumes V and (V’ - V) is removed very fast, allowing the gas to
expand rapidly.

For each process, calculate the changes of pressure, temperature, energy, and entropy of the gas
during its expansion.

39 For a detailed discussion of this issue see, e.g., J. Wilks, The Third Law of Thermodynamics, Oxford U.
Press, 1961.

1.5. For an ideal classical gas with temperature-independent specific heat, derive the relation
between P and V at the adiabatic expansion/compression.

1.6. Calculate the speed and wave impedance of acoustic waves in an ideal classical gas with
temperature-independent $C_P$ and $C_V$, in the limits when the wave propagation may be treated as:

(i) an isothermal process,

(ii) an adiabatic process.

Which of these limits corresponds to higher wave frequencies?

1.7. As will be discussed in Sec. 3.5, the so-called “hardball” models of classical particle
interaction yield the following equation of state of a gas of such particles:

$$P = T\varphi(n),$$

where n = N/V is the particle density, and function $\varphi(n)$ is generally different from that
($\varphi_\mathrm{ideal}(n) = n$) of the ideal gas. For such a gas, with temperature-independent $c_V$,
calculate:

(i) the energy of the gas, and

(ii) its pressure as a function of n at adiabatic compression.


1.8. For an arbitrary thermodynamic system with a fixed number of particles, prove the
following Maxwell relations (already mentioned in Sec. 1.4):

(i): $\left(\dfrac{\partial S}{\partial V}\right)_T = \left(\dfrac{\partial P}{\partial T}\right)_V$,

(ii): $\left(\dfrac{\partial V}{\partial S}\right)_P = \left(\dfrac{\partial T}{\partial P}\right)_S$,

(iii): $\left(\dfrac{\partial S}{\partial P}\right)_T = -\left(\dfrac{\partial V}{\partial T}\right)_P$,

(iv): $\left(\dfrac{\partial P}{\partial S}\right)_V = -\left(\dfrac{\partial T}{\partial V}\right)_S$.

1.9. A process, performed with a fixed portion of an ideal gas, may be represented with a straight
line on the [P, V] plane - see Fig. on the right. Find the point at which the heat flow into/out of the gas
changes its direction.

1.10. Two bodies have equal and constant heat capacities C, but different temperatures, $T_1$ and
$T_2$. Calculate the maximum mechanical work obtainable from this system, using a heat engine.

1.11. Express the efficiency of a heat engine that uses the “Joule cycle” consisting of two adiabatic
and two isobaric processes (see Fig. on the right), via the minimum and maximum values of pressure, and
compare the result with that for the Carnot cycle. Assume an ideal classical working gas with constant
$C_P$ and $C_V$.



1.12. Calculate the efficiency of a heat engine using the “Otto cycle”, 40 which consists of two
adiabatic and two isochoric (constant-volume) processes - see Fig. on the right. Explore how the
efficiency depends on the ratio $r = V_\mathrm{max}/V_\mathrm{min}$, and compare it with the Carnot
cycle’s efficiency. Assume an ideal working gas with temperature-independent specific heat.

1.13. A heat engine’s cycle consists of two isothermal (T = const) and two isochoric (V = const)
reversible processes - see Fig. on the right.

(i) Assuming that the working gas is an ideal classical gas of N particles, calculate the mechanical
work performed by the engine during one cycle.

(ii) Are the specified conditions sufficient to calculate the engine’s efficiency?

40 This name stems from the fact that the cycle is an approximate model of operation of the four-stroke
internal-combustion engine, which was improved and made practicable (though not invented!) by N. Otto
in 1876.


Chapter 2. Principles of Physical Statistics 

This chapter is the key part of this course. It starts with a brief discussion of such basic notions of
statistical physics as statistical ensembles, probability, and ergodicity. Then the so-called
microcanonical distribution postulate is formulated in parallel with the statistical definition of entropy.
The next step is the derivation of the Gibbs distribution, which is frequently considered the summit of
statistical physics, and of one more, grand canonical distribution, which is more convenient for some
tasks - in particular for the derivation of the Boltzmann, Fermi-Dirac, and Bose-Einstein statistics for
systems of independent particles.


2.1. Statistical ensembles and probability 

As has already been discussed in Sec. 1.1, statistical physics deals with systems in conditions
when either the unknown initial conditions, or the system complexity, or the laws of motion (as in the
case of quantum mechanics) do not allow a definite prediction of measurement results. The main
formalism for the analysis of such systems is probability theory, so let me start with a very brief
review of its basic concepts using informal “physical” language - less rigorous but (hopefully) more
transparent than a standard mathematical treatment. 1

Consider $N \gg 1$ independent similar experiments carried out with apparently similar systems
(i.e. systems with identical macroscopic parameters such as volume, pressure, etc.), but still giving, for
any of the reasons outlined above, different results of measurements. Such a collection of experiments,
together with the fixed method of result processing, is a good example of a statistical ensemble. Let us
start from the case when the experiments may have M different discrete outcomes, and the numbers of
experiments giving the corresponding different results are $N_1, N_2, \ldots, N_M$, so that

$$\sum_{m=1}^{M} N_m = N. \qquad (2.1)$$


The probability of each outcome, for the given statistical ensemble, is then defined as 


$$W_m \equiv \lim_{N \to \infty} \frac{N_m}{N}. \qquad (2.2)$$


Though this definition is so close to our everyday experience that it is almost self-evident, a few remarks 
may still be relevant. 


First, probabilities $W_m$ depend on the exact statistical ensemble they are defined for, notably
including the method of result processing. As an example, consider the standard coin tossing. For the
ensemble of all tossed coins, the probabilities of both the heads and tails outcomes equal 1/2. However,
nothing prevents us from defining another statistical ensemble as a set of coin-tossing experiments with
the heads-up outcome. Evidently, the probability of finding coins with tails up in this new ensemble is
not 1/2 but 0. Still, this set of experiments is not only legitimate but also a rather meaningful statistical
ensemble; for example, the exact position and orientation of the tossed coins on the floor, within this
restricted ensemble, may be rather random.

1 For the reader interested in reviewing a more rigorous approach, I can recommend, for example, Chapter 18 of
the handbook by G. Korn and T. Korn - see MA Sec. 16(ii).

Second, a statistical ensemble does not necessarily require N different physical systems, e.g., N 
different coins. It is intuitively clear that tossing the same coin N times constitutes an ensemble with 
similar statistical properties. More generally, a set of N experiments with the same system provides a 
statistical ensemble equivalent to the set of experiments with N different systems, provided that the 
experiments are kept independent, i.e. that outcomes of past experiments do not affect those of the 
experiments to follow. Moreover, for most physical systems of interest any special preparation is
unnecessary, and N different experiments, separated by sufficiently long time intervals, form a “good”
statistical ensemble - the property called ergodicity. 2

Third, the reference to infinite N in Eq. (2) does not strip the notion of probability of its
practical relevance. Indeed, it is easy to prove (see Chapter 5) that, under very general conditions, at finite
but sufficiently large N, the numbers $N_m$ approach their average (or expectation) values 3

$$\langle N_m \rangle = W_m N, \qquad (2.3)$$

with the relative deviation scale decreasing as $1/\langle N_m \rangle^{1/2}$.
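
The approach of the fractions $N_m/N$ to the probabilities $W_m$ may be illustrated numerically. The following is only a minimal sketch, assuming a pseudo-random generator as a stand-in for independent experiments; the function name `empirical_frequencies` and the chosen ensemble sizes are illustrative assumptions, not part of the original notes.

```python
import random


def empirical_frequencies(probabilities, n_experiments, seed=0):
    """Simulate n_experiments independent trials and return the fractions N_m / N."""
    rng = random.Random(seed)  # fixed seed, so that the run is reproducible
    counts = [0] * len(probabilities)
    outcomes = range(len(probabilities))
    for m in rng.choices(outcomes, weights=probabilities, k=n_experiments):
        counts[m] += 1
    return [n_m / n_experiments for n_m in counts]


W = [0.5, 0.5]  # fair coin: heads, tails
for N in (100, 10_000, 1_000_000):
    freq = empirical_frequencies(W, N)
    # Expected scale of the relative deviation, 1 / <N_m>^(1/2), with <N_m> = W_m N
    scale = (W[0] * N) ** -0.5
    print(N, freq, "relative deviation scale ~", scale)
```

With growing N, the printed fractions should cluster ever closer to 1/2, with deviations of the order of the printed scale.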

Now let me list those properties of probabilities that we will immediately need. First, dividing
Eq. (1) by N and following the limit $N \to \infty$, we get the well-known normalization condition

$$\sum_{m=1}^{M} W_m = 1; \qquad (2.4)$$

just remember that it is true only if each experiment definitely yields one of the outcomes $N_1, N_2, \ldots, N_M$.
Next, if we have an additive function of results,

$$f = \frac{1}{N} \sum_{m=1}^{M} N_m f_m, \qquad (2.5)$$

where $f_m$ are some definite (deterministic) coefficients, we may define the statistical average (also called
the expectation value) of the function as

$$\langle f \rangle \equiv \lim_{N \to \infty} \frac{1}{N} \sum_{m=1}^{M} \langle N_m \rangle f_m, \qquad (2.6)$$


2 The most popular example of a non-ergodic system is an energy-conserving system of particles placed
in a potential which is a quadratic form of particle coordinates. Theory of oscillations tells us (see, e.g., CM Sec.
5.2) that this system is equivalent to a set of non-interacting harmonic oscillators. Each of these oscillators
conserves its own initial energy $E_j$ forever, so that the statistics of N measurements of one such system may differ
from that of N different systems with random distribution of $E_j$, even if the total energy of the system, $E = \sum_j E_j$, is
the same. Such non-ergodicity, however, is a rather feeble phenomenon, and is readily destroyed by any of the
“mixing” mechanisms, such as weak interaction with the environment (leading, in particular, to oscillation damping),
nonlinear interaction of the components (see, e.g., CM Ch. 4), and chaos (CM Ch. 9), all of them strongly
enhanced by increasing the number of particles in the system, i.e. the number of its degrees of freedom. This is
why most real-life systems are ergodic; for those interested in non-ergodic exotics, I can recommend the
monograph by V. Arnold and A. Avez, Ergodic Problems of Classical Mechanics, Addison-Wesley, 1989.

3 Here, and everywhere in these notes, angle brackets $\langle \ldots \rangle$ mean averaging over a statistical ensemble, which is
generally different from averaging over time - as will be the case in quite a few examples considered below.

so that using Eq. (3) we get

$$\langle f \rangle = \sum_{m=1}^{M} W_m f_m. \qquad (2.7)$$

Notice that the normalization condition (4) may be considered as a particular case of this general result, for all $f_m = 1$.
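
As a quick illustration of the expectation value expressed via probabilities, here is a minimal sketch for a fair die; the die, the function name `expectation`, and the tolerance used in the normalization check are assumptions made only for this example.

```python
def expectation(probabilities, values):
    """Expectation value <f> = sum_m W_m f_m for normalized probabilities W_m."""
    if abs(sum(probabilities) - 1.0) > 1e-12:
        raise ValueError("probabilities must obey the normalization condition")
    return sum(w * f for w, f in zip(probabilities, values))


W = [1 / 6] * 6          # fair die: six equally probable outcomes (an assumption)
f = [1, 2, 3, 4, 5, 6]   # the function f_m is the number of points on each face
mean_points = expectation(W, f)
unit_average = expectation(W, [1] * 6)  # all f_m = 1 recovers the normalization sum
print(mean_points, unit_average)
```

Setting all $f_m = 1$ reduces the sum to the normalization condition, which is why the second printed number should be very close to 1.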


Next, the spectrum of possible experimental outcomes is frequently continuous. (Think, for
example, about the positions of the marks left by bullets fired into a target from afar.) The above
formulas may be readily generalized to this case; let us start from the simplest situation when all
different outcomes may be described by one continuous variable q, which replaces the discrete index m
in Eqs. (1)-(7). The basic relation for this case is the self-evident fact that the probability dW of having
an outcome within a very small interval dq near point q is proportional to the magnitude of that interval:

$$dW = w(q)\,dq. \qquad (2.8)$$


Function w(q), which does not depend on dq, is called the probability density. Now all the above
formulas may be recast by replacing probabilities $W_m$ with products (8), and the summation over m with
integration over q. In particular, instead of Eq. (4) the normalization condition now becomes

$$\int w(q)\,dq = 1, \qquad (2.9)$$

where the integration should be extended over the whole range of possible values of q. Similarly, instead
of Eq. (5), it is natural to consider a function f(q). Then instead of Eq. (7), the expectation value of the
function may be calculated as

$$\langle f \rangle = \int w(q) f(q)\,dq. \qquad (2.10)$$
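
The continuous-variable formulas may be checked numerically in the same spirit. The sketch below assumes, purely for illustration, a Gaussian probability density of width `sigma` and a simple trapezoidal integrator; neither is prescribed by the text above.

```python
import math


def integrate(func, a, b, steps=20_000):
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


sigma = 2.0  # hypothetical width of the distribution


def w(q):
    """Gaussian probability density with zero mean and width sigma."""
    return math.exp(-q * q / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


# The integration range is wide enough for the tails to be negligible.
norm = integrate(w, -10 * sigma, 10 * sigma)                         # normalization
mean_q2 = integrate(lambda q: w(q) * q * q, -10 * sigma, 10 * sigma)  # <q^2>
print(norm, mean_q2)
```

For this density, the first number should be close to 1 and the second one close to sigma squared.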


It is straightforward to generalize these formulas to the case of more variables. For example, results of 
measurements of a particle with 3 degrees of freedom may be described by the probability density w 
defined in the 6D space of its generalized radius-vector q and momentum p. As a result, the expectation 
value of a function of these variables may be expressed as a 6D integral 


$$\langle f \rangle = \int w(\mathbf{q}, \mathbf{p}) f(\mathbf{q}, \mathbf{p})\,d^3q\,d^3p. \qquad (2.11)$$


Some systems considered in this course consist of components whose quantum properties
cannot be ignored, so let us discuss how $\langle f \rangle$ should be calculated in this case. If by $f_m$ we mean
measurement results, Eq. (7) (and its generalizations) of course remains valid, but since these numbers
themselves may be affected by the intrinsic quantum-mechanical uncertainty, it may make sense to have
a bit deeper look into this situation. Quantum mechanics tells us 4 that the most general expression for
the expectation value of an observable f in a certain ensemble of macroscopically similar systems is

$$\langle f \rangle = \sum_{m,m'} W_{mm'} f_{m'm} \equiv \mathrm{Tr}(Wf). \qquad (2.12)$$

Here $f_{mm'}$ are the matrix elements of the quantum-mechanical operator $\hat{f}$ corresponding to the
observable f, in a full basis of orthonormal states m,

$$f_{mm'} = \langle m | \hat{f} | m' \rangle, \qquad (2.13)$$


4 See, e.g., QM Sec. 6.1. 
while coefficients $W_{mm'}$ are elements of the so-called density matrix W, which represents, in the same
basis, a density operator $\hat{W}$ describing properties of this ensemble. Equation (12) is evidently more
general than Eq. (7), and is reduced to it only if the density matrix is diagonal:

$$W_{mm'} = W_m \delta_{mm'} \qquad (2.14)$$

(where $\delta_{mm'}$ is the Kronecker symbol), when the diagonal elements $W_m$ play the role of probabilities of
the corresponding states.

Thus the largest difference between the quantum and classical descriptions is the presence, in Eq.
(12), of the off-diagonal elements of the density matrix. They have the largest values in the pure (also called
“coherent”) ensemble, in which the state of the system may be described with state vectors, e.g., the ket-vector

$$|\alpha\rangle = \sum_m \alpha_m |m\rangle, \qquad (2.15)$$

where $\alpha_m$ are some complex coefficients. In this simple case, the density matrix elements are merely

$$W_{mm'} = \alpha_m \alpha_{m'}^*, \qquad (2.16)$$


so that the off-diagonal elements are of the same order as the diagonal elements. For example, in the
very important particular case of a two-level system, the pure-state density matrix is

$$W = \begin{pmatrix} \alpha_1 \alpha_1^* & \alpha_1 \alpha_2^* \\ \alpha_2 \alpha_1^* & \alpha_2 \alpha_2^* \end{pmatrix}, \qquad (2.17)$$

so that the product of its off-diagonal components is as large as that of the diagonal components. In the
most important basis of stationary states, i.e. eigenstates of the system’s time-independent Hamiltonian,
coefficients $\alpha_m$ oscillate in time as 5

$$\alpha_m(t) = \alpha_m(0) \exp\left\{-i\frac{E_m}{\hbar}t\right\} \equiv |\alpha_m| \exp\left\{-i\frac{E_m}{\hbar}t + i\varphi_m\right\}, \qquad (2.18)$$


where $E_m$ are the corresponding eigenenergies, and $\varphi_m$ are constant phase shifts. This means that while
the diagonal terms of the density matrix (16) remain constant, its off-diagonal components are
oscillating functions of time:

$$W_{mm'} = \alpha_m \alpha_{m'}^* = |\alpha_m \alpha_{m'}| \exp\left\{-i\frac{E_m - E_{m'}}{\hbar}t\right\} \exp\{i(\varphi_m - \varphi_{m'})\}. \qquad (2.19)$$

Due to the extreme smallness of the Planck constant (on the human scale of things), minuscule random
perturbations of eigenenergies are equivalent to substantial random changes of the phase multiplier, so
that the time average of any off-diagonal matrix element tends to zero. Moreover, even if our statistical
ensemble consists of systems with exactly the same $E_m$, but different values $\varphi_m$ (which are typically hard
to control at the initial preparation of the system), the average values of all $W_{mm'}$ (with $m \neq m'$) vanish
again.
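
The vanishing of the off-diagonal elements upon averaging over random phases is easy to see numerically. The following is only a sketch, assuming a two-level system with fixed moduli of the coefficients and phases drawn uniformly from [0, 2π); the function names and the ensemble size are hypothetical.

```python
import cmath
import math
import random


def pure_state_density_matrix(alpha):
    """Matrix with elements alpha_m * conj(alpha_m') for a pure state."""
    return [[a * b.conjugate() for b in alpha] for a in alpha]


def ensemble_average(n_systems, moduli, seed=1):
    """Average the pure-state density matrices over systems with random phases."""
    rng = random.Random(seed)
    dim = len(moduli)
    avg = [[0j] * dim for _ in range(dim)]
    for _ in range(n_systems):
        # Same moduli for every system, but an independent random phase per state
        alpha = [r * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for r in moduli]
        W = pure_state_density_matrix(alpha)
        for m in range(dim):
            for k in range(dim):
                avg[m][k] += W[m][k] / n_systems
    return avg


moduli = [math.sqrt(0.3), math.sqrt(0.7)]  # so that the diagonal elements are 0.3 and 0.7
W_avg = ensemble_average(20_000, moduli)
print("diagonal:", W_avg[0][0].real, W_avg[1][1].real)
print("off-diagonal magnitude:", abs(W_avg[0][1]))
```

The diagonal elements are insensitive to the phases, while the magnitude of the averaged off-diagonal element should decrease roughly as the inverse square root of the number of systems in the ensemble.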


5 Here I use the Schrödinger picture of quantum mechanics in which the matrix elements $f_{mm'}$ do not evolve in
time.

This is why, besides some very special cases, typical statistical ensembles of quantum particles
are far from being pure, and in most cases (certainly including thermodynamic equilibrium), a good
approximation for their description is given by the opposite limit of the so-called classical mixture in
which all off-diagonal elements of the density matrix equal zero, and its diagonal elements $W_{mm}$
are merely the probabilities $W_m$ of the corresponding eigenstates. In this case, for observables
compatible with energy, Eq. (12) is reduced to Eq. (7), with $f_m$ being the eigenvalues of variable f.


2.2. Microcanonical ensemble and distribution 

Let us start the discussion of physical statistics with the simplest, microcanonical statistical
ensemble 6 that is defined as a set of macroscopically similar closed (isolated) systems with virtually the
same total energy E. Since in quantum mechanics the energy of a closed system is quantized, it is
convenient to include into the ensemble all systems with energies $E_m$ within a narrow interval $\Delta E \ll E$,
that is nevertheless much larger than the average distance $\delta E$ between the energy levels, so that the
number M of different quantum states within interval $\Delta E$ is large, $M \gg 1$. Such a choice of $\Delta E$ is only
possible if $\delta E \ll E$; however, the reader should not worry too much about this condition, because the
most important applications of the microcanonical ensemble are for very large systems (or very high
energies) when the energy spectrum is very dense. 7



Fig. 2.1. Very schematic image of the microcanonical 
ensemble. (Actually, the ensemble deals with quantum 
states rather than energy levels. An energy level may be 
degenerate, i.e. correspond to several states.) 




This ensemble serves as the basis for the formulation of a postulate which is most frequently
called the microcanonical distribution (or sometimes the “main statistical hypothesis”): in
thermodynamic equilibrium, all possible states of the microcanonical ensemble have equal probability,

$$W_m = \frac{1}{M} = \mathrm{const}. \qquad (2.20)$$


Though in some constructs of statistical mechanics this equality is derived from other axioms, which 
look more plausible to their authors, I believe that Eq. (20) may be taken as the starting point of the 
statistical physics, supported “just” by the compliance of all its corollaries with experimental 
observations. 8 


Note that postulate (20) sheds light on the nature of the macroscopic irreversibility of
microscopically reversible (closed) systems: if such a system was initially in a certain state, its time
evolution with just minuscule interactions with the environment (which are necessary for reaching
thermodynamic equilibrium) would eventually lead to the uniform distribution of its probability among
all states with essentially the same energy. Each of these states is not “better” than the initial one; rather,
in a macroscopic system, there are just so many of these states that the chance to find the system in the
initial state is practically nil - again, think about the ink drop diffusion into a glass of water.

6 The terms “microcanonical”, as well as “canonical” (see Sec. 4 below) are apparently due to J. Gibbs, and I
could not find out his motivation for these names. (“Canonical” in the sense of “standard” or “common” is quite
appropriate, but why “micro”?)

7 Formally, the main result of this section, Eq. (20), is valid for any M (including M = 1); it is just less
informative for small M - and trivial for M = 1.

8 Though I have to move on, let me note that the microcanonical distribution (20) is a very nontrivial postulate,
and my advice to the reader is to give some thought to this foundation of the whole building of statistical mechanics.

Now let us find a suitable definition of entropy S of a microcanonical ensemble member - for
now, in thermodynamic equilibrium only. Since S is a measure of disorder, it should be related to the
amount of information lost when the system goes from the full order to the full disorder, i.e. into the
microcanonical distribution (20), or, in other words, the amount of information 9 necessary to find the
exact state of your system in a microcanonical ensemble.

In information theory, the amount of information necessary to make a definite choice
between two options with equal probabilities (Fig. 2a) is defined as

$$I(2) \equiv \log_2 2 = 1. \qquad (2.21)$$

This unit of information is called a bit. Now, if we need to make a choice between 4 equally probable
opportunities, it can be made in two similar steps (Fig. 2b), each requiring one bit of information, so that
the total amount of information necessary for the choice is

$$I(4) = 2I(2) = 2 = \log_2 4. \qquad (2.22)$$

An obvious extension of this process to the choice between $M = 2^m$ states gives

$$I(M) = mI(2) = m = \log_2 M. \qquad (2.23)$$



Fig. 2.2. “Logarithmic trees” of binary decisions for making a choice between (a) 2 and (b) 4
opportunities with equal probabilities.

This measure, if extended naturally to any integer M, is quite suitable for the definition of
entropy at equilibrium, with the only difference that, following tradition, the binary logarithm is
replaced with the natural one: 10


9 I will rely on the reader’s common sense and intuitive understanding of what information is, because in the formal
information theory this notion is also essentially postulated - see, e.g., the wonderfully clear text by J. Pierce, An
Introduction to Information Theory, Dover, 1980.

10 This is of course just the change of a constant factor: $S(M) = \ln M = \ln 2 \times \log_2 M = \ln 2 \times I(M) \approx 0.693\, I(M)$. A
review of Chapter 1 shows that nothing in thermodynamics prevents us from choosing such a coefficient arbitrarily,
with the corresponding change of the temperature scale - see Eq. (1.9). In particular, in the SI units, Eq. (24b)
becomes $S = -k_B \ln W_m$, so that one bit of information corresponds to the entropy change
$\Delta S = k_B \ln 2 \approx 0.693\, k_B \approx 0.957 \times 10^{-23}$ J/K. By the way, the formula “S = k log W” is engraved on the tombstone of L. Boltzmann (1844-1906)
who was the first one to recognize this intimate connection between the entropy and probability.
$$S \equiv \ln M. \qquad (2.24a)$$

Using Eq. (20), we may recast this definition in the most frequently used form

$$S = \ln \frac{1}{W_m} = -\ln W_m. \qquad (2.24b)$$

(Again, please note that Eq. (24) is valid in thermodynamic equilibrium only!)


Equation (24) satisfies the major condition for the entropy definition in thermodynamics, i.e. to
be a unique characteristic of disorder. Indeed, according to Eq. (20), number M (and hence any function
of M) is the only possible measure characterizing the microcanonical distribution. We also need this
function of M to satisfy another requirement on the entropy, of being an extensive thermodynamic
variable, and Eq. (24) does satisfy this requirement as well. Indeed, mathematics says that for two
independent systems the joint probability is just a product of their partial probabilities, and hence,
according to Eq. (24b), their entropies just add up.
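
The relation between the information measured in bits and the equilibrium entropy, as well as the additivity of the latter, may be illustrated by a few lines of code. This is a minimal sketch; the function names and the particular numbers of states `M1` and `M2` are arbitrary assumptions.

```python
import math


def information_bits(M):
    """Information (in bits) needed to select one of M equally probable states."""
    return math.log2(M)


def equilibrium_entropy(M):
    """Entropy S = ln M of a microcanonical ensemble with M states."""
    return math.log(M)


M1, M2 = 8, 32  # numbers of states of two independent systems (arbitrary)

# Two independent systems have M1 * M2 joint states, so their entropies add up.
S_joint = equilibrium_entropy(M1 * M2)
S_sum = equilibrium_entropy(M1) + equilibrium_entropy(M2)
print(S_joint, S_sum)

# The two measures differ only by the constant factor ln 2.
print(equilibrium_entropy(M1), math.log(2) * information_bits(M1))
```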

Now let us see whether Eqs. (20) and (24) are compatible with the 2nd law of thermodynamics.
For that, we need to generalize Eq. (24) for S to an arbitrary state of the system (generally, out of
thermodynamic equilibrium), with arbitrary state probabilities $W_m$. For that, let us first recognize that M
in Eq. (24) is just the number of possible ways to commit a particular system to a certain state n (n = 1,
2, ..., M) in a statistical ensemble where each state is equally probable. Now let us consider a more
general ensemble, still consisting of a large number $N \gg 1$ of similar systems, but with a certain number
$N_m = W_m N \gg 1$ of systems in each of M states, with $W_m$ not necessarily equal. In this case the evident
generalization of Eq. (24) is that the entropy $S_N$ of the whole ensemble is

$$S_N \equiv \ln M(N_1, N_2, \ldots), \qquad (2.25)$$

where $M(N_1, N_2, \ldots)$ is the number of ways to commit a particular system to a certain state n, while
keeping all numbers $N_n$ fixed. Such a number $M(N_1, N_2, \ldots)$ is clearly equal to the number of ways to
distribute N distinct balls between M different boxes, with the fixed number $N_m$ of balls in each box, but
in no particular order within it. Comparing this description with the definition of the so-called
multinomial coefficients, 11 we get

$$M(N_1, N_2, \ldots) = {}^{N}C_{N_1, N_2, \ldots, N_M} \equiv \frac{N!}{N_1!\, N_2! \cdots N_M!}, \quad \text{with } N = \sum_{m=1}^{M} N_m. \qquad (2.26)$$


In order to simplify the resulting expression for $S_N$, we can use the famous Stirling formula in its
crudest, de Moivre’s form 12 whose accuracy is suitable for most purposes of statistical physics:

$$\ln(N!)\big|_{N \to \infty} \to N(\ln N - 1). \qquad (2.27)$$

When applied to our current problem, this gives the following average entropy per system, 13


11 See, e.g., MA Eq. (2.3). Despite the intimidating name, Eq. (26) may be very simply derived. Indeed, N! is just
the number of all possible permutations of N balls, i.e. the ways to place them in certain positions - say, inside M
boxes. Now in order to take into account that the particular order of the balls in each box is not important, that
number should be divided by all numbers $N_m!$ of possible permutations of balls within each box - that’s it.

12 See, e.g., MA Eq. (2.10). 
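$$S \equiv \frac{S_N}{N} = \frac{1}{N}\left[\ln(N!) - \sum_{m=1}^{M} \ln(N_m!)\right] \to \frac{1}{N}\left[N(\ln N - 1) - \sum_{m=1}^{M} N_m(\ln N_m - 1)\right] \equiv -\sum_{m=1}^{M} \frac{N_m}{N} \ln \frac{N_m}{N}, \qquad (2.28)$$

and since this result is only valid in the limit $N_m \to \infty$ anyway, we may use Eq. (2) to present it as

$$S = -\sum_{m=1}^{M} W_m \ln W_m = \sum_{m=1}^{M} W_m \ln \frac{1}{W_m}. \qquad (2.29)$$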

This extremely important formula 14 may be interpreted as the average of the entropy values given by Eq.
(24), weighted with specific probabilities $W_m$ in accordance with the general formula (7). 15
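
The passage from the multinomial coefficient to the entropy per system may be checked numerically, without the Stirling approximation. The sketch below assumes an arbitrary three-state distribution `W` and uses the log-gamma function to evaluate the logarithms of the factorials; the function names are hypothetical.

```python
import math


def log_multinomial(counts):
    """ln of N! / (N_1! N_2! ...), using lgamma(n + 1) = ln(n!)."""
    n_total = sum(counts)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in counts)


def entropy(probabilities):
    """S = -sum_m W_m ln W_m, with the terms W_m = 0 contributing nothing."""
    return -sum(w * math.log(w) for w in probabilities if w > 0)


W = [0.5, 0.3, 0.2]  # an arbitrary (assumed) distribution over M = 3 states
for N in (10, 1_000, 100_000):
    counts = [round(w * N) for w in W]  # N_m = W_m N, integer for these values
    per_system = log_multinomial(counts) / N
    print(N, per_system, entropy(W))
```

As N grows, the entropy per system calculated from the exact number of combinations should approach the value given by the sum over probabilities, with the difference decreasing roughly as (ln N)/N.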

Now let us find what distribution of probabilities $W_m$ provides the largest value of entropy (29).
The answer is almost evident from a single glance at Eq. (29). For example, if coefficients $W_m$ are
constant (and hence equal to 1/M’) for a subgroup of M’ < M states and equal zero for all others, all M’
nonvanishing terms in the sum (29) are equal to each other, so that

$$S = M' \frac{1}{M'} \ln M' = \ln M', \qquad (2.30)$$

so that the closer M’ is to its maximum value M, the larger S is. Hence, the maximum of S is reached at the
uniform distribution given by Eq. (24).

In order to prove this important fact more strictly, let us find the maximum of the function given by
Eq. (29). If its arguments $W_1, W_2, \ldots, W_M$ were completely independent, this could be done by finding the
point (in the M-dimensional space of coefficients $W_m$) where all partial derivatives $\partial S/\partial W_m$ are equal to
zero. However, since the probabilities are constrained by condition (4), the differentiation has to be
carried out more carefully, taking into account this interdependence:

$$\left[\frac{\partial}{\partial W_m} S(W_1, W_2, \ldots)\right]_{\mathrm{cond}} = \frac{\partial S}{\partial W_m} + \sum_{m' \neq m} \frac{\partial S}{\partial W_{m'}} \frac{\partial W_{m'}}{\partial W_m}. \qquad (2.31)$$


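At the maximum of function S, all such expressions should be equal to zero simultaneously. This
condition may be presented as $\partial S/\partial W_m = \lambda$, where the so-called Lagrange multiplier $\lambda$ is independent of
m. Indeed, at such a point Eq. (31) becomes

$$\left[\frac{\partial}{\partial W_m} S(W_1, W_2, \ldots)\right]_{\mathrm{cond}} = \lambda + \sum_{m' \neq m} \lambda \frac{\partial W_{m'}}{\partial W_m} = \lambda \left(\frac{\partial W_m}{\partial W_m} + \sum_{m' \neq m} \frac{\partial W_{m'}}{\partial W_m}\right) = \lambda \frac{\partial}{\partial W_m}(1) = 0. \qquad (2.32)$$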


13 Strictly speaking, I should use notation $\langle S \rangle$ here. However, following the style accepted in thermodynamics, I
will drop the averaging signs until we really need them to avoid confusion. Again, this shorthand is not too
bad because the relative fluctuations of entropy (as those of any macroscopic variable) are very small at $N \gg 1$.

14 With the replacement of $\ln W_m$ with $\log_2 W_m$ (i.e. division by ln 2), Eq. (29) is famous as the Shannon (or
“Boltzmann-Shannon”) formula for the average information I per symbol in a long communication string using M
different symbols, with probability $W_m$ each.

15 In some textbooks, this simple argument is even accepted as the derivation of Eq. (29); however, it is evidently 
less strict than the one outlined above. 
For the particular expression (29), condition $\partial S/\partial W_m = \lambda$ yields

$$\frac{\partial S}{\partial W_m} \equiv \frac{d}{dW_m}\left[-W_m \ln W_m\right] \equiv -\ln W_m - 1 = \lambda. \qquad (2.33)$$


Equation (33) may hold for all m (and hence the entropy may reach its maximum value) only if $W_m$ is
independent of m. Thus entropy (29) indeed reaches its maximum value (24) at equilibrium.
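
The statement that the uniform distribution maximizes the entropy may also be probed by brute force. The following sketch is only an illustration (it is not a proof): it draws random normalized distributions over an assumed number `M` of states and compares the largest entropy found with ln M.

```python
import math
import random


def entropy(probabilities):
    """S = -sum_m W_m ln W_m, with the terms W_m = 0 contributing nothing."""
    return -sum(w * math.log(w) for w in probabilities if w > 0)


def random_distribution(M, rng):
    """A random set of M non-negative numbers, normalized to unit sum."""
    raw = [rng.random() for _ in range(M)]
    total = sum(raw)
    return [x / total for x in raw]


M = 5                    # number of states (an arbitrary choice)
rng = random.Random(42)  # fixed seed, so that the run is reproducible
max_sampled = max(entropy(random_distribution(M, rng)) for _ in range(10_000))
uniform_entropy = entropy([1 / M] * M)
print(max_sampled, uniform_entropy, math.log(M))
```

None of the sampled distributions should give an entropy exceeding that of the uniform one, which should coincide with ln M.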

To summarize, we see that definition (24) of entropy in statistical physics does fit all the
requirements imposed on this variable by thermodynamics. 16 In particular, we have been able to prove
the 2nd law of thermodynamics, starting from that definition and a more fundamental postulate (20).
Now let me discuss one possible point of discomfort with that definition: it depends on the accepted
energy interval of the microcanonical ensemble, for whose width $\Delta E$ no exact guidance is offered.
However, if the interval $\Delta E$ contains many states, $M \gg 1$, then with a very small relative error
(vanishing in the limit $M \to \infty$), M may be presented as

$$M = g(E)\Delta E, \qquad (2.34)$$

where g(E) is the density of states of the system:

$$g(E) \equiv \frac{d\Sigma(E)}{dE}, \qquad (2.35)$$

$\Sigma(E)$ being the total number of states with energies below E. (Note that the average interval $\delta E$ between
energy levels, mentioned in the beginning of this section, is just $\delta E = \Delta E/M = 1/g$.) Plugging Eq. (34)
into Eq. (24), we get

$$S = \ln M = \ln g(E) + \ln \Delta E, \qquad (2.36)$$

so that the only effect of a particular choice of $\Delta E$ is an offset of entropy by a constant, and in Chapter 1
we have seen that such a shift does not affect any measurable quantity. Of course, Eq. (34), and hence
Eq. (36), are only precise in the limit when the density of states g(E) is so large that the range available for
the appropriate choice of $\Delta E$,

$$g^{-1}(E) \ll \Delta E \ll E, \qquad (2.37)$$

is sufficiently broad: $g(E)E = E/\delta E \gg 1$.

In order to get some feeling of the functions g(E) and S(E) and the feasibility of condition (37),
and also to see whether the microcanonical distribution may be directly used for calculations of
thermodynamic variables in particular systems, let us apply it to a microcanonical ensemble of many
sets of $N \gg 1$ independent, similar harmonic oscillators with eigenfrequency $\omega$. (Please note that the
requirement of a virtually fixed energy is applied, in this case, to the total energy $E_N$ of the set, rather than to a
single oscillator - whose energy E may be virtually arbitrary, though certainly less than $E_N \sim NE$.) Basic
quantum mechanics 17 teaches us that the eigenenergies of such an oscillator form a discrete, equidistant
spectrum:


16 This is not to say that these definitions are fully equivalent. Despite all the wealth of quantitative relations 
given by thermodynamics, it still leaves a substantial uncertainty in the definition of entropy (and hence 
temperature), while Eq. (24) narrows this uncertainty to an unsubstantial constant. 

17 See, e.g., QM Secs. 2.10 and 5.4. 
$$E_m = \hbar\omega\left(m + \frac{1}{2}\right), \quad \text{where } m = 0, 1, 2, \ldots \qquad (2.38)$$

If $\omega$ is kept constant, the zero-point energy $\hbar\omega/2$ does not contribute to any thermodynamic properties of
the system and may be ignored, 18 so that for the sake of simplicity we may take that point as the energy
origin, and replace Eq. (38) with $E_m = m\hbar\omega$. Let us carry out an approximate analysis of the system for
the case when its average energy per oscillator,

$$E = \frac{E_N}{N}, \qquad (2.39)$$

is much larger than the energy quantum $\hbar\omega$. For one oscillator, the number of states with energy $\varepsilon_1$
below a certain value $E_1 \gg \hbar\omega$ is evidently $\Sigma(E_1) \approx E_1/\hbar\omega$ (Fig. 3a). For two oscillators, all possible
values of the total energy $(\varepsilon_1 + \varepsilon_2)$ below some level $E_2$ correspond to the points of a 2D square grid
within the right triangle shown in Fig. 3b, giving $\Sigma(E_2) \approx (1/2)(E_2/\hbar\omega)^2$. For three oscillators, the
possible values of the total energy $(\varepsilon_1 + \varepsilon_2 + \varepsilon_3)$ correspond to those points of the 3D cubic mesh that fit
inside the right pyramid shown in Fig. 3c, giving $\Sigma(E_3) \approx (1/2)(1/3)(E_3/\hbar\omega)^3 = (1/3!)(E_3/\hbar\omega)^3$, etc.



Fig. 2.3. Calculating functions $\Sigma(E_N)$ for the systems of (a) one, (b) two and (c) three quantum oscillators.


An evident generalization of these formulas to arbitrary N gives the number of states

$$\Sigma(E_N) \approx \frac{1}{N!}\left(\frac{E_N}{\hbar\omega}\right)^N, \qquad (2.40)$$

where coefficient 1/N! has the geometrical meaning of the (hyper)volume of the N-dimensional right
pyramid with unit sides. Differentiating Eq. (40), we get

$$g(E_N) \equiv \frac{d\Sigma(E_N)}{dE_N} = \frac{1}{(N-1)!} \frac{E_N^{N-1}}{(\hbar\omega)^N}, \qquad (2.41)$$

so that

$$S_N(E_N) = \ln g(E_N) + \mathrm{const} = -\ln[(N-1)!] + (N-1)\ln E_N - N\ln(\hbar\omega) + \mathrm{const}. \qquad (2.42)$$

18 Let me hope that the reader knows that the zero-point energy is experimentally measurable - for example using
the famous Casimir effect - see, e.g., QM Sec. 9.1. In Sec. 5.6 below we will discuss another method of
experimental observation of that energy.

For $N \gg 1$ we can ignore the difference between N and (N - 1) in both instances, and use the Stirling
formula (27) to simplify this result as

$$S_N(E) - \mathrm{const} \approx N\left(\ln\frac{E_N}{N\hbar\omega} + 1\right) \approx N\ln\frac{E}{\hbar\omega} = \ln\left(\frac{E}{\hbar\omega}\right)^N. \qquad (2.43)$$

(The second approximation is only valid at very high $E/\hbar\omega$ ratios, when the logarithm in Eq. (43) is
substantially larger than 1, i.e. is rather crude. 19) Returning for a second to the density of states, we see
that in the limit $N \to \infty$, it is exponentially large:

$$g(E_N) = e^{S_N} \approx \left(\frac{E}{\hbar\omega}\right)^N, \qquad (2.44)$$

so that both conditions (37) may be satisfied within a very broad range of $\Delta E$.
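
The geometric counting of states may be verified directly for small N. The sketch below measures the energy in units of $\hbar\omega$ (so that the energy bound is an integer, called `max_quanta` here), counts the states by brute force, and compares the result with a closed-form binomial coefficient and with the pyramid-volume estimate; the function names and sample sizes are assumptions made for this illustration.

```python
import math
from itertools import product


def count_states_brute_force(n_oscillators, max_quanta):
    """Count the sets (m_1, ..., m_N) of non-negative integers with sum <= max_quanta."""
    return sum(
        1
        for levels in product(range(max_quanta + 1), repeat=n_oscillators)
        if sum(levels) <= max_quanta
    )


def count_states_exact(n_oscillators, max_quanta):
    """Closed form of the same count: the binomial coefficient C(n + N, N)."""
    return math.comb(max_quanta + n_oscillators, n_oscillators)


def count_states_estimate(n_oscillators, max_quanta):
    """Pyramid-volume estimate (1/N!) * (E_N / hbar omega)^N."""
    return max_quanta ** n_oscillators / math.factorial(n_oscillators)


for N, n in ((1, 10), (2, 10), (3, 20)):
    print(N, n, count_states_brute_force(N, n), count_states_exact(N, n))

# At energies much higher than the quantum, the estimate approaches the exact count.
N, n = 3, 3_000
ratio = count_states_exact(N, n) / count_states_estimate(N, n)
print(ratio)
```

For a fixed N, the printed ratio should tend to 1 as the energy grows; for large N, the numbers of states themselves may still differ noticeably, while their logarithms (i.e. the entropies) agree much better.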

Now we can use Eq. (43) to find all thermodynamic properties of the system, though only in the
limit $E \gg \hbar\omega$. Indeed, according to thermodynamics (see Sec. 1.2), if the system volume and number of
particles are fixed, the derivative dS/dE is nothing more than the reciprocal temperature - see Eqs. (1.9)
or (1.15). In our current case, we imply that the harmonic oscillators are distinct, for example by their
spatial positions. Hence, even if we can speak of some volume of the system, it is certainly fixed. 20
Differentiating Eq. (43) over energy E, we get

$$\frac{1}{T} \equiv \frac{dS_N}{dE_N} = \frac{1}{N}\frac{dS_N}{dE} = \frac{1}{E}. \qquad (2.45)$$

Reading this result backwards, we see that the average energy E of a harmonic oscillator equals T (i.e.
$k_B T_K$ in SI units). As we will show in Sec. 5 below, this is the correct asymptotic form of the exact result,
valid in our current limit $E \gg \hbar\omega$.
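
The relation between the temperature and the average energy per oscillator may be checked numerically with an exact count of states. The following is a sketch under an assumption that differs slightly from the text: instead of the density of states, it uses the number of ways to share exactly n energy quanta among N oscillators, and evaluates the derivative of entropy by a finite difference. All energies and temperatures are in units of $\hbar\omega$, and the function names are hypothetical.

```python
import math


def entropy(n_oscillators, n_quanta):
    """S = ln M, where M = C(n + N - 1, N - 1) is the number of ways to share
    n quanta among N distinct oscillators."""
    return (
        math.lgamma(n_quanta + n_oscillators)
        - math.lgamma(n_quanta + 1)
        - math.lgamma(n_oscillators)
    )


def temperature(n_oscillators, n_quanta):
    """T = (dS/dE_N)^(-1) in units of hbar*omega, from a central finite difference."""
    dS = entropy(n_oscillators, n_quanta + 1) - entropy(n_oscillators, n_quanta - 1)
    return 2.0 / dS


N = 1_000
for E in (1, 10, 100):  # average energy per oscillator, in units of hbar*omega
    print(E, temperature(N, N * E))
```

At large E, the calculated temperature should approach the average energy per oscillator, while at E comparable with the energy quantum the two quantities should differ noticeably, in accordance with the limited validity range of the classical result.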


Result (45) may be readily generalized. Indeed, in quantum mechanics a harmonic oscillator with
eigenfrequency $\omega$ may be described by the Hamiltonian

$$\hat{H} = \frac{\hat{p}^2}{2m} + \frac{\kappa \hat{q}^2}{2}, \qquad (2.46)$$

where q is some generalized coordinate, p is the corresponding generalized momentum, m is the
oscillator’s mass, 21 and $\kappa$ is the spring constant, so that $\omega = (\kappa/m)^{1/2}$. Since in thermodynamic
equilibrium the density matrix is always diagonal (see Sec. 1 above) in the basis of stationary states m,
quantum-mechanical averages of the kinetic and potential energies may be found from Eq. (7):


19 Let me offer a very vivid example of how slowly the logarithm function grows at large values of its argument:
ln of the number of atoms in the visible Universe is less than 200.

20 For the same reason, the notion of pressure P in such a system is not clearly defined, and neither are any
thermodynamic potentials but E and F.

21 Let me hope that using the same letter for the mass and the state number would not lead to reader’s confusion. I 
believe that the difference between these uses is very clear from the context. 
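$$\left\langle \frac{p^2}{2m} \right\rangle = \sum_{m=0}^{\infty} W_m \left\langle m \left| \frac{\hat{p}^2}{2m} \right| m \right\rangle, \qquad \left\langle \frac{\kappa q^2}{2} \right\rangle = \sum_{m=0}^{\infty} W_m \left\langle m \left| \frac{\kappa \hat{q}^2}{2} \right| m \right\rangle, \qquad (2.47)$$

where $W_m$ is the probability to occupy the m-th energy level, and bra- and ket-vectors describe the stationary
state corresponding to that level. 22 However, both classical and quantum mechanics teach us that for any
m, the bra-kets under the sums in Eqs. (47), which present the average kinetic and potential energies
of the oscillator on its m-th energy level, are equal to each other, and hence each of them is equal to $E_m/2$.
Hence, even though we do not know the probability distribution $W_m$ yet (it will be calculated in Sec. 5
below), we may conclude that in the “classical limit” $T \gg \hbar\omega$,

$$\left\langle \frac{p^2}{2m} \right\rangle = \left\langle \frac{\kappa q^2}{2} \right\rangle = \frac{T}{2}. \qquad (2.48)$$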

Now let us consider a system with an arbitrary number of degrees of freedom, described by a
more general Hamiltonian: 23

$$\hat{H} = \sum_j \hat{H}_j, \qquad \hat{H}_j = \frac{\hat{p}_j^2}{2m_j} + \frac{\kappa_j \hat{q}_j^2}{2}, \qquad (2.49)$$

with (generally, different) eigenfrequencies $\omega_j = (\kappa_j/m_j)^{1/2}$. Since the “modes” (effective harmonic
oscillators) contributing to this Hamiltonian are independent, result (48) is valid for each of the
modes. This is the famous equipartition theorem: at thermal equilibrium with $T \gg \hbar\omega_j$, the average
energy of each so-called half-degree of freedom (which is defined as a variable $p_j$ or $q_j$ giving a
quadratic term to the system’s Hamiltonian) is equal to T/2. 24 In particular, for each Cartesian
coordinate $q_j$ of a free-moving, non-interacting particle this theorem is valid for any temperature,
because such coordinates may be considered as 1D harmonic oscillators with vanishing potential energy,
i.e. $\omega_j = 0$, so that condition $T \gg \hbar\omega_j$ is fulfilled at any temperature.

At this point, a first-time student of thermodynamics should be very much relieved to see that the 
counter-intuitive thermodynamic definition (1.9) of temperature does indeed correspond to what we all 
have known about this notion from our kindergarten physics courses. 

I believe that our case study of quantum oscillator systems has been a fair illustration of both the
strengths and weaknesses of the microcanonical ensemble approach. 25 On one hand, we could calculate
virtually everything we wanted in the classical limit T ≫ ħω, but calculations for arbitrary T ~ ħω,
though possible, are difficult, because for that, all vertical steps of function Σ_N(E_N) have to be carefully
counted. In Sec. 4, we will see that other statistical ensembles are much more convenient for such
calculations.

22 Note again that though we have committed the energy E_N of N oscillators to be fixed (in order to apply Eq.
(36), valid only for a microcanonical ensemble at thermodynamic equilibrium), a single oscillator’s energy E in our
analysis may be arbitrary - within the limits ħω ≪ E < E_N ~ NT.

23 As a reminder, the Hamiltonian of any system whose classical Lagrangian function is an arbitrary quadratic
form of its generalized coordinates and the corresponding generalized velocities may be brought to form (49) by an
appropriate choice of “normal coordinates” q_j which are certain linear combinations of the original coordinates -
see, e.g., CM Sec. 5.2.

24 This also means that in the classical limit, the heat capacity of a system is equal to one half of the number of its
half-degrees of freedom (in SI units, multiplied by k_B).

25 The reader is strongly urged to solve Exercise 2, whose task is to do a similar calculation for another key
(“two-level”) physical system, and to compare the results.

Let me conclude this discussion of entropy with a short note on deterministic classical systems
with a few degrees of freedom (and even simpler mathematical objects called “maps”) that may exhibit
essentially disordered behavior, called deterministic chaos. 26 Such a chaotic system may be
approximately characterized by an entropy defined similarly to Eq. (29), where W_m are the probabilities to
find it in different small regions of phase space, at well separated time intervals. On the other hand, one
can use an equation slightly more general than Eq. (29) to define the so-called Kolmogorov (or
“Kolmogorov-Sinai”) entropy K that characterizes the speed of loss of information about the initial state
of the system, and hence what is called the “chaos depth”. In the definition of K, the sum over m is
replaced with the summation over all possible permutations {m} = m_0, m_1, ..., m_{N-1} of small space
regions, and W_m is replaced with W_{m}, the probability of finding the system in the corresponding
regions m at time moments t_m, with t_m = mτ, in the limit τ → 0, with Nτ = const. For chaos in the
simplest objects, 1D maps, K is equal to the Lyapunov exponent λ > 0. 27 For systems of higher
dimensionality, which are characterized by several Lyapunov exponents λ, the Kolmogorov entropy is
equal to the phase-space average of the sum of all positive λ. These facts provide a much more
practicable way of (typically, numerical) calculation of the Kolmogorov entropy than the direct use of
its definition. 28
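As an illustration of such a numerical calculation, here is a minimal sketch for one particular 1D map. The choice of the logistic map x → rx(1 - x) with r = 4 is my assumption (it is not discussed in the text); for this map the Lyapunov exponent is known to equal ln 2. The function name and all numerical parameters are hypothetical as well.

```python
import math

def lyapunov_logistic(r=4.0, x0=0.1234, n_steps=200_000, n_skip=1_000):
    """Estimate the Lyapunov exponent of the map x -> r*x*(1-x) as the
    orbit average of ln|f'(x)|, where f'(x) = r*(1 - 2x)."""
    x = x0
    for _ in range(n_skip):              # discard the initial transient
        x = r * x * (1.0 - x)
    total, count = 0.0, 0
    for _ in range(n_steps):
        slope = abs(r * (1.0 - 2.0 * x))
        if slope > 0.0:                  # guard against log(0) at x = 1/2
            total += math.log(slope)
            count += 1
        x = r * x * (1.0 - x)
    return total / count

lam = lyapunov_logistic()
print(f"estimated exponent: {lam:.4f}, ln 2 = {math.log(2):.4f}")
```

For a 1D map, this orbit average is all one needs; no partition of the phase space into small regions is required.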


2.3. Maxwell’s Demon, information, and computation 

Before proceeding to other statistical distributions, I would like to address one more popular 
concern about Eq. (24), the direct relation between the entropy and information. Some physicists are still 
uneasy with entropy being nothing else than the (deficit of) information, 29 though to the best of my 
knowledge, nobody has yet been able to suggest any experimentally verifiable difference between these 
two notions. Let me give one example of their direct relation, that is essentially a development of the 
thought experiment suggested by Maxwell as early as in 1867. 

Consider a volume containing just one molecule (considered as a point particle), and separated into
two equal halves by a movable partition with a door that may be opened and closed at will, at no energy
cost (Fig. 4a). If the door is open and the system is in thermodynamic equilibrium, we do not know on
which side of the partition the molecule is. Here the disorder (and hence the entropy) is the largest, and
there is no way to get, from a large ensemble of such systems, any useful mechanical energy.

Now, let us consider that we (as instructed by, in Lord Kelvin’s formulation, an omniscient
Maxwell’s Demon) know on which side of the partition the molecule is currently located. Then we may
close the door, so that the molecule’s impacts on the partition create, on the average, a pressure force f
directed toward the empty part of the volume (in Fig. 4b, the right one). Now we can get from the
molecule some mechanical work, say by allowing force f to move the partition to the right, and picking
up the resulting mechanical energy by some deterministic external mechanism. After the partition has
been moved to the right end of the volume, we can open the door again (Fig. 4c), equalizing the
molecule’s average pressure on both sides of the partition, and then slowly move the partition back to
the middle of the volume, without doing any substantial work. With the kind help of Maxwell’s Demon,
we can repeat the cycle again and again, and hence make the system do unlimited mechanical work,
fed “only” by information and thermal motion, and thus implementing the perpetual motion machine of
the 2nd kind - see Sec. 1.6. The fact that such heat engines do not exist means that Maxwell’s Demon
does not either: getting any new information, at nonvanishing temperature (i.e. at thermal agitation of
particles) has a finite energy cost.

26 See, e.g., CM Chapter 9 and literature therein.

27 For the definition of λ, see, e.g., CM Eq. (9.9).

28 For more discussion, see, e.g., either Sec. 6.2 of the monograph H. G. Schuster and W. Just, Deterministic
Chaos, 4th ed., Wiley-VCH, 2005, or the monograph by Arnold and Avez, cited in Sec. 1.

29 While some of these concerns should be treated with due respect (because the ideas of entropy and disorder are
indeed highly nontrivial), I have repeatedly run into rather shallow arguments which stemmed from arrogant
contempt for information theory as an “engineering discipline”, and unwillingness to accept the notion of
information on an equal footing with those of space, time, and energy. Fortunately, most leading physicists are
much more flexible, and there are even opposite extremes such as J. A. Wheeler’s “it from bit” (i.e. matter from
information) philosophy - to which I cannot fully subscribe either.



Fig. 2.4. The Maxwell’s Demon paradox: the volume with a single molecule (a) before and (b) after
closing the door, and (c) after opening the door at the end of the expansion stage.


In order to evaluate this cost, let us calculate the maximum work per cycle made by the
Maxwell’s heat engine (Fig. 4), assuming that it is constantly in thermal equilibrium with a heat bath of
temperature T. Formula (21) tells us that the information supplied by the demon (which exactly half of
the volume contains the molecule) is exactly one bit, I(2) = 1. According to Eq. (24), this means that by
getting this information we are reducing the entropy by ΔS_I = -ln2. Now, it would be a mistake to plug this
(negative) entropy change into Eq. (1.19). First, that relation is only valid for slow, reversible processes.
Moreover (and more importantly), this equation, as well as its irreversible version (1.41), is only valid
for a fixed statistical ensemble. The change ΔS_I does not belong to this category, and may be formally
described by the change of the statistical ensemble - from the one consisting of all similar systems
(experiments) with an unknown location of the molecule, to the new ensemble consisting of the systems
with the molecule in its certain (in Fig. 4, left) half. 30

30 This procedure of redefining the statistical ensemble is the central point of the connection between
information theory and physics, and is crucial in particular for any (meaningful :-) discussion of measurements in
quantum mechanics - see, e.g., QM Secs. 2.5 and 7.7.

Now let us consider the slow expansion of the “gas” after the door had been closed. At this stage,
we do not need the demon’s help any longer (i.e. the statistical ensemble is fixed), and we can use
relation (1.19). At the assumed isothermal conditions (T = const), this relation may be integrated over
the whole expansion process, giving ΔQ = TΔS. At the final position, the system’s entropy should be
the same as initially, i.e. before the door had been closed, because we again do not know where in the
volume the molecule is. This means that the entropy was replenished, during the reversible expansion,
from the heat bath, by ΔS = -ΔS_I = +ln2, so that ΔQ = TΔS = T ln2. Since by the end of the whole cycle
the internal energy E of the system is the same as before, all this heat should have gone into the
mechanical energy obtained during the expansion. Thus the obtained work per cycle (i.e. for each
obtained information bit) is T ln2 (k_B T_K ln2 in SI units), about 3×10^-21 Joule at room temperature. This is
exactly the minimum energy cost of getting one bit of new information about a system at temperature T.
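The numerical value of this energy is easy to reproduce. The short sketch below assumes “room temperature” to mean 300 K; the function name bit_energy is introduced here only for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant in J/K (exact in the current SI)

def bit_energy(temperature_kelvin, bits=1.0):
    """Energy k_B * T_K * ln 2 per bit, in joules."""
    return bits * K_B * temperature_kelvin * math.log(2)

print(f"{bit_energy(300.0):.2e} J per bit at 300 K")
```

The result scales linearly with temperature, so that at liquid-helium temperatures (~4 K) it is almost two orders of magnitude lower.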

The smallness of that amount on the everyday human scale has left the Maxwell’s Demon
paradox an academic exercise for almost a century. However, its discussion resumed in the 1960s in the
context of energy consumption at numerical calculations, motivated by the exponential (Moore’s-law)
progress of digital integrated circuits, which leads, in particular, to a fast reduction of the energy ΔE
“spent” (turned into heat) per one binary logic operation. In the current generations of semiconductor
digital integrated circuits, ΔE is of the order of 10^-16 J, 31 i.e. still exceeds the room-temperature value
of T ln2 = k_B T_K ln2 ≈ 3×10^-21 J by more than 4 orders of magnitude. Still, some engineers believe that
thermodynamics imposes an important lower limit on ΔE and hence presents an insurmountable obstacle
to the future progress of computation, 32 so that the issue deserves a discussion.

Let me believe that the reader of these notes understands that, in contrast to naive popular
thinking, computers do not create any new information; all they can do is to reshape (process) it, losing
most of the input information on the go. Indeed, any digital computation algorithm may be decomposed into
simple, binary logical operations, each of them performed by a certain logic circuit called the logic gate.
Some of these gates (e.g., logical NOT performed by inverters, as well as memory READ and WRITE
operations) do not change the amount of information in the computer. On the other hand, such
information-irreversible logic gates as two-input NAND (or NOR, or XOR, etc.) actually erase one bit
at each operation, because they turn two input bits into one output bit (Fig. 5a).
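The difference between these two kinds of gates may be made explicit by a simple count: a gate is logically reversible only if different inputs always give different outputs, so that the input may be restored from the output. The sketch below is a hypothetical illustration (the helper names are mine), comparing the NOT and NAND gates mentioned above.

```python
from itertools import product

def truth_table(gate, n_inputs):
    """Map every combination of input bits to the gate output."""
    return {bits: gate(*bits) for bits in product((0, 1), repeat=n_inputs)}

def is_logically_reversible(gate, n_inputs):
    """True if no two different inputs share the same output."""
    table = truth_table(gate, n_inputs)
    return len(set(table.values())) == len(table)

def logic_not(a):
    return 1 - a

def nand(a, b):
    return 1 - (a & b)

print(truth_table(nand, 2))
print(is_logically_reversible(logic_not, 1))
print(is_logically_reversible(nand, 2))
```

The NAND truth table has four different inputs but only two possible outputs, so that the input cannot be restored from the output alone.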



Fig. 2.5. Simple examples of (a) irreversible and (b) potentially reversible logic circuits. Each rectangle
represents a circuit storing one bit of information.


In 1961, R. Landauer arrived at the conclusion that each logic operation should turn into heat at 
least energy 


Energy dissipation at irreversible computation:

$$\Delta E_{\min} = T \ln 2 = k_B T_K \ln 2. \tag{2.51}$$

This result may be illustrated with the Maxwell’s Demon machine shown in Fig. 4, operating as
a heat pump. At the first stage, with the door closed, it uses external mechanical work ΔE = T ln2 to reduce
the volume in which the molecule is confined from V to V/2, pumping heat -ΔQ = ΔE into the heat
bath. To model a logically irreversible logic gate, let us now open the door in the partition, and thus
lose 1 bit of information about the molecule’s position. Then we will never get the work T ln2 back, because
moving the partition back to the right, with the door open, takes place at zero average pressure. Hence, Eq.
(51) gives a fundamental limit for energy loss (per bit) at the logically irreversible computation.

31 In the dominating CMOS technology, ΔE is close to twice the energy CV^2/2 of recharging the total capacitance
C of the transistor gate electrodes and the wires interconnecting the gates, by the voltage V representing the binary
unity. As the technology progresses, C decreases in approximate proportion with the minimum feature size,
resulting in the almost proportional decrease of ΔE. (The used voltage V has almost saturated at ~1 V - the value
that stems from the bandgap of ~1 eV of the used semiconductor - silicon.)

32 Unfortunately, this delusion has resulted in a substantial and unjustified shift of electron device research
resources toward using “non-charge degrees of freedom” (such as spin) - as if they do not obey the general laws
of statistical physics!


However, in 1973 C. Bennett came up with convincing arguments that it is possible to avoid
such energy loss by using only operations that are reversible not only physically, but also logically. 33
For that, one has to avoid any loss of information, i.e. any erasure of intermediate results, for example in
the way shown in Fig. 5b. (For that, gate F should be physically reversible, with no substantial static
power consumption.) At the end of all calculations, after the result has been copied into a memory, the
intermediate results may be “rolled back” through the reversible gates to be eventually merged into a copy of
the input data, again without erasing a single bit. The minimal energy dissipation at such reversible
calculation tends to zero as the operation speed is decreased, so that the average energy loss per bit may
be less than the perceived “fundamental thermodynamic limit” (51). 34 The price to pay for this ultralow
dissipation is an enormous (exponential) complexity of the hardware necessary for storage of all
intermediate results. However, using irreversible gates sparingly, it may be possible to reduce the
complexity dramatically, so that in the future mostly reversible computation may be able to reduce
energy consumption in practical digital electronics. 35


Before we leave Maxwell’s Demon behind, let me use it to discuss, one more time, the
relation between the reversibility of the classical and quantum mechanics of Hamiltonian systems and
the irreversibility possible in thermodynamics and statistical physics. In our (or rather Lord Kelvin’s :-)
gedanken experiment shown in Fig. 4, the laws of mechanics governing the motion of the molecule are
reversible at all times. Still, at the partition’s motion to the right, driven by the molecule’s impacts, the entropy
grows, because the molecule picks up heat ΔQ > 0, and hence entropy ΔS = ΔQ/T > 0, from the heat
bath. The physical mechanism of this irreversible entropy (read: disorder) growth is the interaction of
the molecule with uncontrollable components of the heat bath, and the resulting loss of information
about the motion of the molecule. Philosophically, the emergence of irreversibility in large systems is a
strong argument against reductionism - a naive belief that knowing the exact laws of Nature at one
level of its complexity, we can readily understand all the phenomena on the higher levels of its
organization. In reality, the macroscopic irreversibility of large systems is a wonderful example of a new
law (in this case, the 2nd law of thermodynamics) that becomes relevant on a substantially new level of
complexity - without defying the lower-level laws. Without such new laws, very little of the higher-level
organization of Nature may be understood.


33 C. Bennett, IBM J. Res. Devel. 17, 525 (1973); see also a later review C. Bennett, Int. J. Theor. Phys. 21, 905
(1982). To the best of my knowledge, the sub-T ln2 energy loss per logic step is still to be demonstrated
experimentally, but at least one research team is closing in on this goal.

34 Reversible computation may also overcome the perceived “fundamental quantum limit”, ΔEΔt > ħ, where Δt is
the time scale of the binary logic operation - see K. Likharev, Int. J. Theor. Phys. 21, 311 (1982).

35 The situation is rather different for quantum computation, which may be considered as a specific type of
reversible but analog computation - see, e.g., QM Sec. 8.5 and references therein.


2.4. Canonical ensemble and the Gibbs distribution

As we have seen in Sec. 2, the microcanonical distribution may be directly used for solving some
simple problems, 36 but a further development of this approach (also due to J. Gibbs) turns out to be
much more convenient for calculations. Let us consider a statistical ensemble of similar systems,
each in thermal equilibrium with a much larger heat bath of temperature T (Fig. 6a). Such
an ensemble is called canonical.


Fig. 2.6. (a) System in a heat bath (a canonical ensemble member) and (b) energy spectrum of the
composite system (including the heat bath).


Next, it is intuitively evident that if the heat bath is sufficiently large, any thermodynamic
variables characterizing the system under study should not depend on the heat bath’s environment. In
particular, we may assume that the heat bath is thermally insulated; then the total energy E_Σ of the
composite system (consisting of the system of our interest, plus the heat bath) does not change in time.
For example, if our system of interest is on its certain (say, m-th) energy level, then

$$E_\Sigma = E_m + E_{HB} \tag{2.52}$$

is conserved. Now let us partition this canonical ensemble into much smaller sub-ensembles, each being
a microcanonical ensemble of composite systems whose total energy E_Σ is the same - as discussed in
Sec. 2, within a certain small energy interval ΔE_Σ ≪ E_Σ. According to the microcanonical distribution,
the probabilities to find the composite system, within this new ensemble, in any state are equal. Still, the heat
bath energies E_HB = E_Σ - E_m (Fig. 6b) of members of this microcanonical sub-ensemble may be different
due to the difference in E_m.

The probability W(E_m) to find the system of our interest (within the selected sub-ensemble) on
some energy level E_m is proportional to the number ΔM of such systems in the sub-ensemble. Due to the
very large size of the heat bath in comparison with that of the system under study, the heat bath’s density
of states g_HB is very high, and ΔE_Σ may be selected so that

$$\frac{1}{g_{HB}} \ll \Delta E_\Sigma \ll \left| E_m - E_{m'} \right| \ll E_{HB}, \tag{2.53}$$

where m and m’ are any states of the system of our interest. As Fig. 6b shows, in this case we may write
ΔM = g_HB(E_HB)ΔE_Σ. As a result, within the microcanonical ensemble with the total energy E_Σ,

$$W_m \propto \Delta M = g_{HB}(E_{HB})\,\Delta E_\Sigma = g_{HB}(E_\Sigma - E_m)\,\Delta E_\Sigma. \tag{2.54}$$

36 See also the exercise problems listed at the end of this chapter.

Let us simplify this expression further, using the Taylor expansion with respect to relatively
small E_m ≪ E_Σ. However, here we should be careful. As we have seen in Sec. 2, the density of states of
a large system is an extremely rapidly growing function of energy, so that if we applied the Taylor
expansion directly to Eq. (54), the Taylor series would converge for very small E_m only. A much
broader applicability range may be obtained by taking the logarithm of both sides of Eq. (54) first:

$$\ln W_m = \text{const} + \ln\left[ g_{HB}(E_\Sigma - E_m) \right] + \ln \Delta E_\Sigma = \text{const} + S_{HB}(E_\Sigma - E_m), \tag{2.55}$$

where the second equality results from the application of Eq. (36) to the heat bath, and ln ΔE_Σ has been
incorporated into the constant. Now, we can Taylor-expand the (much smoother) function of energy
on the right-hand side, and limit ourselves to the two leading terms of the series:

$$\ln W_m \approx \text{const} + S_{HB}\Big|_{E_m = 0} - \left. \frac{dS_{HB}}{dE_{HB}} \right|_{E_m = 0} E_m. \tag{2.56}$$


But according to Eq. (1.9), the derivative participating in this expression is nothing else than the
reciprocal heat bath temperature that (due to the large bath size) does not depend on whether E_m is equal
to zero or not. Since our system of interest is in thermal equilibrium with the bath, this is also the
temperature T of the system - see Eq. (1.8). Hence Eq. (56) is merely

$$\ln W_m = \text{const} - \frac{E_m}{T}. \tag{2.57}$$

This equation describes a substantial decrease of W_m as E_m is increased by several T, and hence our
linear approximation (56) is virtually exact as soon as E_HB is much larger than T - the condition that is
rather easy to satisfy, because as we have seen in Sec. 2, the average energy of each particle is of the
order of T.
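The advantage of expanding the logarithm, rather than the density of states itself, may be seen in a simple numerical experiment. The sketch below assumes, purely for illustration, a toy heat bath with the power-law density of states g_HB(E) ∝ E^N; this model, the parameter values, and the function names are my assumptions rather than a part of the derivation above. For such a bath, Eq. (1.9) gives T = E_HB/N.

```python
import math

N = 1000             # exponent of the assumed power law g(E) ~ E**N
E_TOTAL = 1000.0     # total energy of the composite system (arbitrary units)
T = E_TOTAL / N      # bath temperature from 1/T = dS/dE = N/E, taken at E_m = 0

def exact_ratio(e_m):
    """g(E_total - e_m) / g(E_total) for the power-law density of states."""
    return (1.0 - e_m / E_TOTAL) ** N

def direct_taylor(e_m):
    """Two leading terms of the Taylor series of g itself."""
    return 1.0 - N * e_m / E_TOTAL

def log_taylor(e_m):
    """Exponent of the two leading terms of the Taylor series of ln g."""
    return math.exp(-e_m / T)

for e_m in (0.5, 2.0, 5.0):
    print(e_m, exact_ratio(e_m), direct_taylor(e_m), log_taylor(e_m))
```

Already at E_m equal to a few T, the direct two-term expansion gives a meaningless negative “probability”, while the exponent following from the expansion of the logarithm stays close to the exact ratio.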


Now we should be careful again, because so far we have only derived Eq. (57) for a sub-ensemble
with fixed E_Σ. However, since the right-hand side of Eq. (57) includes only E_m and T that are
independent of E_Σ, this relation is valid for all sub-ensembles of the canonical ensemble, and hence for
the latter ensemble as a whole. 37 Hence for the total probability to find our system of interest in a state
with energy E_m, in the canonical ensemble with temperature T, we can write


Gibbs distribution:

$$W_m = \text{const} \times \exp\left\{ -\frac{E_m}{T} \right\}. \tag{2.58}$$

This is the famous Gibbs distribution (sometimes called the “canonical distribution”), 38 which is
arguably the summit of statistical physics, 39 because it may be used for a straightforward (or
at least conceptually straightforward :-) calculation of all statistical and thermodynamic variables.


37 Another way to arrive at the same conclusion is to note that the entropy of the whole canonical ensemble with
fixed E_m has to be a sum of entropies of its microcanonical sub-ensembles (with different E_Σ), which participate
in Eq. (55). As a result, the logarithm of probability W_m for our system of interest to have energy E_m in the whole
(canonical) ensemble is just a sum of Eqs. (57) for sub-ensembles with different E_Σ.

38 The temperature dependence of the type exp{-E/T}, especially when showing up in rates of certain events, e.g.,
chemical reactions, is also frequently called the Arrhenius law - after chemist S. Arrhenius who has noticed this

Before I illustrate this, let me first calculate the normalization coefficient in Eq. (58), const ≡ 1/Z, for the
general case. Requiring, in accordance with Eq. (4), the sum of all W_m to be equal to 1, we get

Statistical sum:

$$Z = \sum_m \exp\left\{ -\frac{E_m}{T} \right\}, \tag{2.59}$$


where the summation is formally extended to all quantum states of the system, though in practical
calculations, the sum may be truncated to include only the states that are noticeably occupied. This
apparently humble normalization coefficient Z turns out to be so important for the relation between the
Gibbs distribution (i.e. statistics) and thermodynamics that it has a special name - or actually, two
names: either the statistical sum or the partition function. To demonstrate how important Z is, let us use
the general Eq. (29) for entropy to calculate its value for the particular case of the canonical ensemble,
i.e. the Gibbs distribution of probabilities W_m:

$$S = -\sum_m W_m \ln W_m = \ln Z \sum_m W_m + \frac{1}{T} \sum_m W_m E_m. \tag{2.60}$$


According to the general rule (7), the thermodynamic (i.e. ensemble-average) value E of the internal
energy of the system is

$$E = \sum_m W_m E_m, \tag{2.61a}$$


so that the second term on the right-hand side of Eq. (60) is just E/T, while the first term equals just lnZ,
due to the normalization condition (59). (As a parenthetic remark, using the notion of reciprocal
temperature β ≡ 1/T, Eq. (61a), with account of Eq. (59), may be also rewritten as

$$E = -\frac{\partial (\ln Z)}{\partial \beta}. \tag{2.61b}$$

This formula is very convenient for calculations if our prime interest is the average energy E rather than
F or W_m.) With these substitutions, Eq. (60) yields a very simple relation between the statistical sum and
entropy:

$$S = \frac{E}{T} + \ln Z. \tag{2.62}$$


Using Eq. (1.33), we see that Eq. (62) gives a straightforward way to calculate the free energy F of the
system from nothing else than its statistical sum:

F from Z:

$$F \equiv E - TS = T \ln \frac{1}{Z}. \tag{2.63}$$
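The chain of relations from the statistical sum to the thermodynamic variables is easy to follow numerically. The sketch below is a minimal illustration for a system with a few energy levels; the particular level values, the temperature, and the function names are arbitrary assumptions of mine. Temperature is measured in energy units, as elsewhere in these notes.

```python
import math

def canonical(energies, temperature):
    """Return Z, probabilities W_m, E, S, and F for the given energy levels."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)                                       # statistical sum
    probs = [w / z for w in weights]                       # Gibbs distribution
    energy = sum(p * e for p, e in zip(probs, energies))   # ensemble average
    entropy = -sum(p * math.log(p) for p in probs)         # S = -sum W ln W
    free_energy = -temperature * math.log(z)               # F = T ln(1/Z)
    return z, probs, energy, entropy, free_energy

def energy_from_ln_z(energies, temperature, d_beta=1e-6):
    """E = -d(ln Z)/d(beta), estimated by a central finite difference."""
    beta = 1.0 / temperature

    def ln_z(b):
        return math.log(sum(math.exp(-b * e) for e in energies))

    return -(ln_z(beta + d_beta) - ln_z(beta - d_beta)) / (2 * d_beta)

levels = [0.0, 1.0, 1.0, 2.5]    # arbitrary example, with one degenerate level
T = 0.7
Z, W, E, S, F = canonical(levels, T)
print("S - (E/T + ln Z) =", S - (E / T + math.log(Z)))
print("F - (E - T S)    =", F - (E - T * S))
print("E - (-dlnZ/dbeta) =", E - energy_from_ln_z(levels, T))
```

All three printed differences should vanish within the numerical accuracy, the last one being limited by the finite-difference step.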


law in experimental data. In all cases I am aware of, the Gibbs distribution is the underlying reason of the 
Arrhenius law. 

39 This opinion is shared by several authoritative colleagues, including R. Feynman who climbs on this summit
already by page 4 (!) of his brilliant book Statistical Mechanics, 2nd ed., Westview, 1998. (Despite its title, this
monograph is a collection of lectures on a few diverse, mostly advanced topics of statistical physics, rather than its
systematic course, so that unfortunately I cannot recommend it as a textbook.)

Now, using the general thermodynamic relations (see especially the circular diagram shown in
Fig. 1.7b, and its discussion) we can calculate all thermodynamic potentials of the system, and all other
variables of interest. Let me only note that in order to calculate pressure P - e.g., from the second of Eqs.
(1.35) - we would need to know the explicit dependence of F, and hence of the statistical sum Z, on the
system volume V. This would require the calculation, by appropriate methods of either classical or
quantum mechanics, of the volume dependence of eigenenergies E_m. I will give numerous examples of
such calculations later in the course. 40


As the final note of this section, Eqs. (59) and (63) may be combined to give a very elegant
expression,

$$\exp\left\{ -\frac{F}{T} \right\} = \sum_m \exp\left\{ -\frac{E_m}{T} \right\}, \tag{2.64}$$

which offers a convenient interpretation of free energy as a (rather specific) average of eigenenergies of
the system. One more convenient formula may be obtained by using Eq. (64) to rewrite the Gibbs
distribution (58) in the form

$$W_m = \exp\left\{ \frac{F - E_m}{T} \right\}. \tag{2.65}$$

In particular, this expression shows that since all probabilities W_m are below 1, F is always
lower than the lowest energy level. Also, note that probabilities W_m do not depend on the energy
reference choice, i.e. on an arbitrary constant added to all E_m (and hence to E and F).
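Both properties are readily verified numerically. In the following sketch, the energy levels, the shift, and the temperature are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def free_energy(energies, temperature):
    """F = -T ln sum_m exp(-E_m/T), i.e. the free energy from the statistical sum."""
    return -temperature * math.log(sum(math.exp(-e / temperature) for e in energies))

def gibbs_probabilities(energies, temperature):
    """W_m = exp{(F - E_m)/T}."""
    f = free_energy(energies, temperature)
    return [math.exp((f - e) / temperature) for e in energies]

levels = [0.3, 1.0, 1.7]
shift = 5.0                      # arbitrary constant added to all levels
T = 0.8
w = gibbs_probabilities(levels, T)
w_shifted = gibbs_probabilities([e + shift for e in levels], T)

print("sum of W_m:", sum(w))
print("F below the lowest level:", free_energy(levels, T) < min(levels))
print("largest change of W_m after the shift:",
      max(abs(a - b) for a, b in zip(w, w_shifted)))
```

The shift changes F by the same constant, so that the differences F - E_m, and hence all the probabilities, stay intact.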


2.5. Harmonic oscillator statistics


The last property may be immediately used in our first example of the Gibbs distribution
application to a particular, but very important system - the harmonic oscillator, for a more general case
than was done in Sec. 2, namely for a “quantum oscillator” with an arbitrary relation between T and
ħω. 41 Let us consider a canonical ensemble of similar oscillators, each in contact with a heat bath of
temperature T. Selecting the zero-point energy ħω/2 for the origin of E, the oscillator eigenenergies (38)
become E_m = mħω (m = 0, 1, ...), so that the Gibbs distribution for probabilities of these states is

$$W_m = \frac{1}{Z} \exp\left\{ -\frac{E_m}{T} \right\} = \frac{1}{Z} \exp\left\{ -\frac{m\hbar\omega}{T} \right\}, \tag{2.66}$$


40 In many multiparticle systems, the effect of an external field may be presented as a sum of its effects on each
particle - frequently described by an interaction energy with the structure -ℱ_j q_j^(k), where q_j^(k) is a generalized coordinate
of the k-th particle. Generally, this energy has to be included directly into the energies of particle states E_m, used in Z, and
hence in the free energy F (63). In this case, the thermodynamic equilibrium corresponds to the minimum of F -
see Eq. (1.42). On the other hand, for “linear” systems (whose energy is a quadratic-homogeneous form of its
generalized coordinates and velocities), equivalent results may be obtained by accounting for the interaction at the
thermodynamic level, i.e. by subtracting the term ℱ_j⟨q_j⟩ = ℱ_j N⟨q_j^(k)⟩ from the free energy F calculated in the
absence of the field, and then finding the equilibrium as a minimum of the resulting Gibbs energy G - see Eq.
(1.43). In this case, either of the approaches is fine, provided only that the same interaction is not counted twice.

41 A simpler task of making a similar calculation for another key quantum-mechanical object, the two-level
system, is left for the reader’s exercise.

with the statistical sum

$$Z = \sum_{m=0}^{\infty} \exp\left\{ -\frac{m\hbar\omega}{T} \right\} \equiv \sum_{m=0}^{\infty} \lambda^m, \quad \text{where } \lambda \equiv \exp\left\{ -\frac{\hbar\omega}{T} \right\} < 1. \tag{2.67}$$

This series is just an infinite geometric progression (“geometric series”); summing it, 42 we get

$$Z = \frac{1}{1 - \lambda} = \frac{1}{1 - e^{-\hbar\omega/T}}, \tag{2.68}$$

so that the probability W_m to find the oscillator at each energy level is

Quantum oscillator’s statistics:

$$W_m = \left( 1 - e^{-\hbar\omega/T} \right) e^{-m\hbar\omega/T}. \tag{2.69}$$
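Distribution (69) is easy to explore numerically. The sketch below, with temperature measured in units of ħω, checks its normalization and uses a coarse grid search (my choice of method, with arbitrary grid parameters and hypothetical function names) to find the temperature at which the occupation probability of each excited level peaks.

```python
import math

def level_probability(m, t):
    """W_m of the quantum oscillator; t is the temperature in units of hbar*omega."""
    x = math.exp(-1.0 / t)
    return (1.0 - x) * x ** m

def peak_over_temperature(m, t_min=0.01, t_max=50.0, n=50_000):
    """Coarse grid search for the temperature maximizing W_m (for m >= 1)."""
    best_t, best_w = t_min, level_probability(m, t_min)
    for i in range(1, n + 1):
        t = t_min + (t_max - t_min) * i / n
        w = level_probability(m, t)
        if w > best_w:
            best_t, best_w = t, w
    return best_t, best_w

print("sum of W_m at t = 2:", sum(level_probability(m, 2.0) for m in range(200)))
for m in (1, 2, 5, 10):
    t_peak, w_peak = peak_over_temperature(m)
    print(m, t_peak / m, m * w_peak)
```

The grid result may be compared with the exact position of the maximum, T = ħω/ln(1 + 1/m), which follows from differentiating Eq. (69) over temperature.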


As Fig. 7a shows, the probability W_m to find the oscillator in each particular state (but the ground
one, with m = 0) vanishes in both low- and high-temperature limits, and reaches its maximum value W_m
~ 0.3/m at T ~ mħω, so that the contribution mħωW_m of each level to the average oscillator energy E is
always smaller than ħω.

Fig. 2.7. Statistical and thermodynamic parameters of a harmonic oscillator, as functions of temperature.


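This average energy may be calculated in either of two ways: either using Eq. (7):

$$E = \sum_{m=0}^{\infty} E_m W_m = \left( 1 - e^{-\hbar\omega/T} \right) \sum_{m=0}^{\infty} m\hbar\omega\, e^{-m\hbar\omega/T}, \tag{2.70}$$

or (simpler) using Eq. (61b), as

$$E = -\frac{\partial (\ln Z)}{\partial \beta} = \frac{\partial}{\partial \beta} \ln\left( 1 - e^{-\beta\hbar\omega} \right), \quad \text{where } \beta \equiv \frac{1}{T}. \tag{2.71}$$

42 See, e.g., MA Eq. (2.8b).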


Both methods give (of course) the same famous result, 43

$$E = E(\omega, T) = \hbar\omega \frac{1}{e^{\hbar\omega/T} - 1}, \tag{2.72}$$

which is valid for arbitrary temperature and plays the key role in many fundamental problems of
physics. The red line in Fig. 7b shows this E as a function of normalized temperature. At low
temperatures, T ≪ ħω, the oscillator is predominantly in its lowest (ground) state, and its energy (on top
of the constant zero-point energy ħω/2!) is exponentially small: E ≈ ħω exp{-ħω/T} ≪ T, ħω. On the
other hand, in the high-temperature limit the energy tends to T. This is exactly the result (a particular
case of the equipartition theorem) that was obtained in Sec. 2 from the microcanonical distribution.
Please note how much simpler the calculation starting from the Gibbs distribution is, even for an
arbitrary ratio T/ħω.
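The agreement of the two ways of calculation, and both temperature limits, may be checked by a few lines of code. In this sketch the energy and temperature are measured in units of ħω; the truncation of the sum at 5,000 terms and the function names are my assumptions.

```python
import math

def energy_closed_form(t):
    """Average energy 1/(exp(1/t) - 1), with t = T/(hbar*omega)."""
    return 1.0 / math.expm1(1.0 / t)

def energy_direct_sum(t, n_terms=5_000):
    """Direct (truncated) sum of E_m * W_m over the oscillator levels."""
    x = math.exp(-1.0 / t)
    return (1.0 - x) * sum(m * x ** m for m in range(n_terms))

for t in (0.1, 1.0, 10.0):
    print(t, energy_closed_form(t), energy_direct_sum(t))

print("low-T ratio to exp(-1/t):", energy_closed_form(0.1) / math.exp(-10.0))
print("high-T ratio E/T:", energy_closed_form(100.0) / 100.0)
```

The truncation is harmless here because the terms of the series decrease exponentially with m.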


To complete the discussion of thermodynamic properties of the harmonic oscillator, we can
calculate its free energy using Eq. (63):

$$F = T \ln \frac{1}{Z} = T \ln\left( 1 - e^{-\hbar\omega/T} \right). \tag{2.73}$$

Now entropy may be found from thermodynamics: either from the first of Eqs. (1.35), S = -(∂F/∂T)_V, or
(even more easily) from Eq. (1.33): S = (E - F)/T. Both relations give, of course, the same result:

$$S = \frac{\hbar\omega}{T} \frac{1}{e^{\hbar\omega/T} - 1} - \ln\left( 1 - e^{-\hbar\omega/T} \right). \tag{2.74}$$


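Finally, since in the general case the dependence of the oscillator properties (essentially, $\omega$) on volume $V$ in this problem is not specified, such variables as $P$, $\mu$, $G$, $W$, and $\Omega$ are not defined, and we may calculate only the average heat capacity $C$ per one oscillator:

$$C \equiv \frac{\partial E}{\partial T} = \left(\frac{\hbar\omega}{T}\right)^2 \frac{e^{\hbar\omega/T}}{\left(e^{\hbar\omega/T} - 1\right)^2} = \left[\frac{\hbar\omega/2T}{\sinh(\hbar\omega/2T)}\right]^2. \tag{2.75}$$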


The calculated thermodynamic variables are shown in Fig. 7b. In the low-temperature limit ($T \ll \hbar\omega$), they all tend to zero. On the other hand, in the high-temperature limit ($T \gg \hbar\omega$), $F \to -T\ln(T/\hbar\omega) \to -\infty$, $S \to \ln(T/\hbar\omega) \to +\infty$, and $C \to 1$ (in SI units, $C \to k_B$). Note that the last limit is a direct corollary of the equipartition theorem: each of the two “half-degrees of freedom” of the oscillator gives, in the classical limit, a contribution $C = 1/2$ to its heat capacity.
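The relations among Eqs. (2.72)-(2.75) may be verified with a few lines of code. This is a minimal sketch, assuming energy units for temperature; the function names are illustrative and not taken from the text.

```python
import math

def energy(hw, T):
    """Eq. (2.72)."""
    return hw / math.expm1(hw / T)

def free_energy(hw, T):
    """Eq. (2.73): F = T ln(1 - e^{-hw/T})."""
    return T * math.log(-math.expm1(-hw / T))

def entropy(hw, T):
    """Eq. (2.74), written in the equivalent form S = (E - F)/T."""
    return (energy(hw, T) - free_energy(hw, T)) / T

def heat_capacity(hw, T):
    """Eq. (2.75), in its sinh form."""
    y = hw / (2.0 * T)
    return (y / math.sinh(y)) ** 2

# High-temperature limit: C approaches 1 (i.e. k_B in SI units)
print(heat_capacity(1.0, 100.0))
```

Comparing `entropy` with a finite-difference estimate of $-\partial F/\partial T$, and `heat_capacity` with that of $\partial E/\partial T$, confirms that the two routes to $S$ mentioned above agree.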

Now let us use Eq. (69) to discuss the statistics of the quantum oscillator described by Hamiltonian (46), in the coordinate representation. Again using the density matrix's diagonality at thermodynamic equilibrium, we may use a relation similar to Eqs. (47) to calculate the probability density to find the oscillator at coordinate $q$:

$$w(q) = \sum_{m=0}^{\infty} W_m w_m(q) = \sum_{m=0}^{\infty} W_m \left|\psi_m(q)\right|^2 = \left(1 - e^{-\hbar\omega/T}\right)\sum_{m=0}^{\infty} e^{-m\hbar\omega/T}\left|\psi_m(q)\right|^2, \tag{2.76}$$

43 It was first obtained in 1924 by S. Bose, and is frequently called the Bose distribution - a particular case of the Bose-Einstein distribution - to be discussed in Sec. 8 below.

where $\psi_m(q)$ is the eigenfunction of the $m$-th stationary state of the oscillator. Since each $\psi_m(q)$ is proportional to the Hermite polynomial 44 that requires at least $m$ elementary functions for its representation, working out the sum in Eq. (76) is a bit tricky, 45 but the final result is rather simple: $w(q)$ is just a normalized Gaussian distribution (the “bell curve”),

$$w(q) = \frac{1}{(2\pi)^{1/2}\,\delta q}\exp\left\{-\frac{q^2}{2(\delta q)^2}\right\}, \tag{2.77}$$

with $\langle q\rangle = 0$, and

$$\left\langle q^2\right\rangle = (\delta q)^2 = \frac{\hbar}{2m\omega}\coth\frac{\hbar\omega}{2T}. \tag{2.78}$$


Since $\coth\xi$ tends to 1 at $\xi \to \infty$, and diverges as $1/\xi$ at $\xi \to 0$, Eq. (78) shows that the width $\delta q$ of the coordinate distribution is constant (and equal to that, $(\hbar/2m\omega)^{1/2}$, of the ground-state wavefunction $\psi_0$) at $T \ll \hbar\omega$, and grows as $(T/m\omega^2)^{1/2}$ at $T/\hbar\omega \to \infty$.

As a sanity check, we may use Eq. (78) to write the following expression,

$$\langle U\rangle \equiv \left\langle\frac{m\omega^2 q^2}{2}\right\rangle = \frac{\hbar\omega}{4}\coth\frac{\hbar\omega}{2T} \to \begin{cases} \hbar\omega/4, & \text{at } T \ll \hbar\omega, \\ T/2, & \text{at } \hbar\omega \ll T, \end{cases} \tag{2.79}$$

for the average potential energy of the oscillator. In order to comprehend this result, let us notice that Eq. (72) for the average full energy $E$ was obtained by counting it from the ground state energy $\hbar\omega/2$ of the oscillator. 46 If we add this energy to the result, we get


$$E = \frac{\hbar\omega}{e^{\hbar\omega/T} - 1} + \frac{\hbar\omega}{2} = \frac{\hbar\omega}{2}\coth\frac{\hbar\omega}{2T}. \tag{2.80}$$


We see that for arbitrary temperature, $\langle U\rangle = E/2$, as we already concluded from Eq. (47). This means that the average kinetic energy, equal to $E - \langle U\rangle$, is also the same:

$$E - \langle U\rangle = \frac{E}{2} = \frac{\hbar\omega}{4}\coth\frac{\hbar\omega}{2T}. \tag{2.81}$$

In the classical limit $T \gg \hbar\omega$, both energies equal $T/2$, reproducing the equipartition theorem result (48).
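The identity behind Eq. (2.80), and the equal split of the full energy between its potential and kinetic parts, can be checked directly. The following is a small sketch with hypothetical function names, again with temperature in energy units.

```python
import math

def potential_energy(hw, T):
    """Eq. (2.79): <U> = (hw/4) coth(hw/2T)."""
    return 0.25 * hw / math.tanh(hw / (2.0 * T))

def total_energy(hw, T):
    """Eq. (2.80): Eq. (2.72) plus the ground-state energy hw/2."""
    return hw / math.expm1(hw / T) + 0.5 * hw

# <U> should equal E/2 at any temperature
for T in (0.05, 1.0, 50.0):
    print(T, potential_energy(1.0, T), 0.5 * total_energy(1.0, T))
```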


2.6. Two important applications 

The results of the previous section, especially Eq. (72), have innumerable applications in physics, but I will have time for a brief discussion of only two of them.


44 See, e.g., QM Sec. 2.10. 

45 The calculation may be found, e.g., in QM Sec. 7.2. 

46 As a quantum mechanics reminder, the ground state energy of the oscillator is not only measurable, but is also responsible for several important phenomena, e.g., the Casimir effect - see, e.g., QM Sec. 9.1.

(i) Blackbody radiation. Let us consider a free-space volume $V$ limited by non-absorbing (i.e. ideally reflecting) walls. Electrodynamics tells us 47 that the electromagnetic field in such a cavity may be presented as a sum of “modes” with time evolution similar to that of the usual harmonic oscillator, and quantum mechanics says 48 that the energy of such an electromagnetic oscillator is quantized in accordance with Eq. (38), so that at thermal equilibrium the average energy is described by Eq. (72). If volume $V$ is large enough, 49 the number of these modes within a small range $dk$ of the wavevector magnitude $k$ is 50

$$dN = \frac{gV}{(2\pi)^3}\,d^3k = \frac{gV}{(2\pi)^3}\,4\pi k^2\,dk, \tag{2.82}$$

where for electromagnetic waves, the degeneracy factor $g = 2$, due to their two different (e.g., linear) polarizations for the same wave vector $\mathbf{k}$. With the isotropic dispersion relation for waves in vacuum, $k = \omega/c$, the elementary volume $d^3k$ corresponding to a small interval $d\omega$ is a spherical shell of small thickness $dk = d\omega/c$, and Eq. (82) yields


$$dN = \frac{2V}{(2\pi)^3}\,4\pi\frac{\omega^2\,d\omega}{c^3} = V\frac{\omega^2}{\pi^2c^3}\,d\omega. \tag{2.83}$$


Using Eq. (72), we see that the spectral density of electromagnetic wave energy, per unit volume, is

$$u(\omega) \equiv \frac{E}{V}\frac{dN}{d\omega} = \frac{\hbar\omega^3}{\pi^2c^3}\,\frac{1}{e^{\hbar\omega/T} - 1}. \tag{2.84}$$


This is the famous Planck's blackbody radiation law. 51 To understand why its name mentions radiation, let us consider a small planar part, of area $dA$, of a surface that completely absorbs electromagnetic waves incident from any direction. (Such a “perfect black body” approximation may be closely approached in special experimental structures, especially in limited frequency intervals.) Figure 8 shows that if the arriving wave were planar, with the incidence angle $\theta$, then the power $d\mathcal{P}_\theta(\omega)$ absorbed by the surface within a small frequency interval $d\omega$ (i.e. the energy arriving at the surface within a unit time interval) would be equal to the radiation energy within the same frequency interval and inside a cylinder of height $c$, base area $dA\cos\theta$, and hence volume $dV = c\,dA\cos\theta$:

$$d\mathcal{P}_\theta(\omega) = u(\omega)\,d\omega\,dV = u(\omega)\,d\omega\,c\,dA\cos\theta. \tag{2.85}$$


Since the thermally-induced field is isotropic, i.e. propagates equally in all directions, this result should be averaged over all solid angles within the polar angle interval $0 \le \theta \le \pi/2$:

$$\frac{d\mathcal{P}(\omega)}{dA\,d\omega} = \frac{1}{4\pi}\int\frac{d\mathcal{P}_\theta(\omega)}{dA\,d\omega}\,d\Omega = c\,u(\omega)\,\frac{1}{4\pi}\int_0^{\pi/2}\sin\theta\,d\theta\int_0^{2\pi}d\varphi\,\cos\theta = \frac{c}{4}\,u(\omega). \tag{2.86}$$


47 See, e.g., EM Sec. 7.9. 

48 See, e.g., QM Sec. 9.1. 

49 In our current context, the volume should be much larger than $(c\hbar/T)^3$, where $c \approx 3\times10^8$ m/s is the speed of light. For room temperature ($T \approx k_B \times 300\ \text{K} \approx 4\times10^{-21}$ J), that lower bound is of the order of $10^{-16}$ m$^3$.

50 See, e.g., EM Sec. 7.9, or QM Sec. 1.6. 

51 Let me hope the reader knows that the law was first suggested in 1900 by M. Planck as an empirical fit for the experimental data on blackbody radiation, and this was the historic point at which the Planck constant $\hbar$ (or rather $h \equiv 2\pi\hbar$) was introduced - see, e.g., QM Sec. 1.1.

Hence Planck's formula (84), multiplied by $c/4$, gives the power absorbed by such a “blackbody” surface. But at thermal equilibrium, this absorption has to be exactly balanced by the surface's own radiation, due to its finite temperature $T$.

Fig. 2.8. Calculating the relation between $d\mathcal{P}_\theta(\omega)$ and $u(\omega)\,d\omega$.


I am confident that the reader is familiar with the main features of the Planck law (84), including its general shape (Fig. 9), with the low-frequency asymptote $u(\omega) \propto \omega^2$ (due to its historic significance bearing the special name of the Rayleigh-Jeans law), the exponential drop at high frequencies (the Wien law), and the resulting maximum of function $u(\omega)$, reached at frequency $\omega_{\max}$,

$$\hbar\omega_{\max} \approx 2.82\,T, \tag{2.87}$$

i.e. at wavelength $\lambda_{\max} = 2\pi/k_{\max} = 2\pi c/\omega_{\max} \approx 2.22\,c\hbar/T$. Still, I cannot help mentioning two particular values corresponding to visible light ($\lambda_{\max} \sim 500$ nm) for the Sun's surface temperature $T_K \approx 6{,}000$ K, and to the mid-infrared range ($\lambda_{\max} \sim 10$ μm) for the Earth's surface temperature $T_K \approx 300$ K. The balance of these two radiations, absorbed and emitted by the Earth, determines its surface temperature, and hence has the key importance for all life on our planet. As one more example, the cosmic microwave background (CMB) radiation, closely following the Planck law with $T_K = 2.725$ K (and hence having maximum density at $\lambda_{\max} \approx 1.9$ mm), and in particular its weak anisotropy, is a major source of data for all modern cosmology. 52
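The numerical factor in Eq. (2.87) follows from the condition $d[\xi^3/(e^\xi - 1)]/d\xi = 0$, which reduces to $3(1 - e^{-\xi}) = \xi$ with $\xi \equiv \hbar\omega/T$. The sketch below solves this equation by simple fixed-point iteration and evaluates $\lambda_{\max}$ for the CMB temperature quoted above; the constant values are the standard SI ones, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K
C = 2.99792458e8        # speed of light, m/s

def planck_peak_x(iterations=100):
    """Solve 3(1 - e^{-x}) = x for x = hbar*omega_max/T by fixed-point iteration."""
    x = 3.0
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x

def peak_wavelength(T_kelvin):
    """Wavelength 2*pi*c/omega_max at which u(omega) of Eq. (2.84) is largest."""
    omega_max = planck_peak_x() * K_B * T_kelvin / HBAR
    return 2.0 * math.pi * C / omega_max

print(planck_peak_x())         # close to 2.82
print(peak_wavelength(2.725))  # CMB: close to 1.9e-3 m
```

Note that this is the maximum of the density per unit frequency interval, which is the quantity defined by Eq. (2.84).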


Fig. 2.9. Frequency dependence of the blackbody radiation density, normalized by $u_0 \equiv T^3/\pi^2\hbar^2c^3$, according to the Planck law (red line) and the Rayleigh-Jeans law (blue line).



52 For a recent popular book on this topic, see, e.g., S. Singh, Big Bang: The Origins of the Universe, HarperCollins, 2005.

Now let us calculate the total energy $E$ of this radiation in some volume $V$. It may be found from Eq. (84) by integrating it over all frequencies: 53

$$E = V\int_0^\infty u(\omega)\,d\omega = V\int_0^\infty \frac{\hbar\omega^3}{\pi^2c^3}\,\frac{d\omega}{e^{\hbar\omega/T} - 1} = \frac{VT^4}{\pi^2\hbar^3c^3}\int_0^\infty\frac{\xi^3\,d\xi}{e^\xi - 1} = V\frac{\pi^2}{15\hbar^3c^3}\,T^4. \tag{2.88}$$


(The last transition in Eq. (88) uses a table integral equal to $\Gamma(4)\zeta(4) = (3!)(\pi^4/90) = \pi^4/15$. 54) Using Eq. (86) to recast Eq. (88) into the total power radiated by a blackbody surface, we get the well-known Stefan (or “Stefan-Boltzmann”) law

$$\frac{d\mathcal{P}}{dA} = \frac{\pi^2}{60\hbar^3c^2}\,T^4 \equiv \sigma T_K^4, \tag{2.89a}$$

where $\sigma$ is the Stefan-Boltzmann constant

$$\sigma \equiv \frac{\pi^2}{60\hbar^3c^2}\,k_B^4 \approx 5.67\times10^{-8}\ \frac{\text{W}}{\text{m}^2\text{K}^4}. \tag{2.89b}$$
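Both the table integral used in Eq. (2.88) and the numerical value in Eq. (2.89b) are easy to reproduce. In the sketch below, the integral is evaluated via the series $\int_0^\infty \xi^3 d\xi/(e^\xi - 1) = 6\sum_{n \ge 1} n^{-4}$ (obtained by expanding the integrand in powers of $e^{-\xi}$); the SI constant values and function names are assumptions of this illustration, not part of the text.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K
C = 2.99792458e8        # speed of light, m/s

def table_integral(n_max=10000):
    """Integral of xi^3/(e^xi - 1) from 0 to infinity, as the series 6 * sum(1/n^4)."""
    return 6.0 * sum(1.0 / n**4 for n in range(1, n_max + 1))

def stefan_boltzmann_constant():
    """Eq. (2.89b): sigma = pi^2 k_B^4 / (60 hbar^3 c^2), in W/(m^2 K^4)."""
    return math.pi**2 * K_B**4 / (60.0 * HBAR**3 * C**2)

print(table_integral(), math.pi**4 / 15)
print(stefan_boltzmann_constant())  # close to 5.67e-8
```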


By this time, the thoughtful reader should have an important concern ready: Eq. (84) and hence Eq. (88) are based on Eq. (72) for the average energy of each oscillator, counted from its ground-state energy $\hbar\omega/2$. However, the radiation power should not depend on the energy origin; why have we not included the ground-state energy of each oscillator into integration (88), as we have done in Eq. (80)? The answer is that usual radiation detectors only measure the difference between the power $\mathcal{P}_{in}$ of the incident radiation (say, that of a blackbody surface with temperature $T$) and their own back-radiation with power $\mathcal{P}_{out}$, corresponding to some effective temperature $T_d$ of the detector (Fig. 10). But however low $T_d$ is, the temperature-independent ground-state energy contribution $\hbar\omega/2$ to the back radiation is always there. Hence, the $\hbar\omega/2$ drops out from the difference, and cannot be detected - at least in this simple way. This is the reason why we had the right to ignore this contribution in Eq. (88) - very fortunately, because it would lead to the integral's divergence at its upper limit. However, let me repeat that the ground-state energy of the electromagnetic field oscillators is physically real - and important.




Fig. 2.10. Generic scheme of 
the electromagnetic radiation 
power measurement. 


53 Note that the heat capacity $C_V \equiv (\partial E/\partial T)_V$, following from Eq. (88), is proportional to $T^3$ at any temperature, and hence does not obey the trend $C_V \to$ const at $T \to \infty$. This is the result of the unlimited growth, with temperature, of the number of thermally-excited field oscillators with $\hbar\omega < T$.

54 See, e.g., MA Eqs. (6.8b), (6.6b), and (2.7b).

One more interesting result may be deduced from the free energy $F$ of the electromagnetic radiation, which may also be calculated by integration of Eq. (73) over all the modes, with the appropriate weight:

$$F = \sum_\omega T\ln\left(1 - e^{-\hbar\omega/T}\right) \to \int_0^\infty T\ln\left(1 - e^{-\hbar\omega/T}\right)\frac{dN}{d\omega}\,d\omega = \int_0^\infty T\ln\left(1 - e^{-\hbar\omega/T}\right)\left(V\frac{\omega^2}{\pi^2c^3}\right)d\omega. \tag{2.90}$$


Presenting $\omega^2\,d\omega$ as $d(\omega^3)/3$, this integral may be readily worked out by parts, and reduced to a table integral similar to that in Eq. (88), yielding a surprisingly simple result:

$$F = -V\frac{\pi^2}{45\hbar^3c^3}\,T^4 = -\frac{E}{3}. \tag{2.91}$$
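The ratio $F/E = -1/3$ may also be confirmed by brute-force numerical integration of the dimensionless integrands of Eqs. (2.88) and (2.90). The following sketch uses a plain midpoint rule; the upper cutoff (60) and the number of steps are arbitrary choices of this illustration, and the helper names are hypothetical.

```python
import math

def midpoint_integral(f, a, b, steps):
    """Midpoint-rule estimate of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def energy_integrand(x):
    """xi^3/(e^xi - 1), the dimensionless integrand of Eq. (2.88)."""
    return x**3 / math.expm1(x)

def free_energy_integrand(x):
    """xi^2 ln(1 - e^{-xi}), the dimensionless integrand of Eq. (2.90)."""
    return x**2 * math.log(-math.expm1(-x))

I_E = midpoint_integral(energy_integrand, 0.0, 60.0, 60000)
I_F = midpoint_integral(free_energy_integrand, 0.0, 60.0, 60000)
print(I_E, I_F, I_F / I_E)  # the ratio should be close to -1/3
```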


Now we can use the second of the general thermodynamic equations (1.35) to calculate pressure:

$$P = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{\pi^2}{45\hbar^3c^3}\,T^4 = \frac{E}{3V}. \tag{2.92a}$$

This result might, of course, be derived by integrating the expression for the forces exerted by each mode of the electromagnetic field on the walls confining it to volume $V$, 55 but notice how much simpler the thermodynamic calculation is. Rewritten in the form,

$$PV = \frac{E}{3}, \tag{2.92b}$$


this result may be considered as the equation of state of the electromagnetic field, i.e. from the quantum-mechanical point of view, of the photon gas. As we will prove in the next chapter, the equation of state (1.44) of the ideal classical gas may be presented in a similar form, but with a coefficient generally different from that in Eq. (92). In particular, according to the equipartition theorem, for an ideal gas of nonrelativistic atoms whose internal degrees of freedom are in their ground state, and whose whole energy is that of three translational “half-degrees of freedom”, $E = 3N(T/2)$, the factor before $E$ is twice larger than in Eq. (92). On the other hand, a relativistic treatment of the classical gas shows that Eq. (92) is valid for any gas in the ultrarelativistic limit, $T \gg mc^2$, where $m$ is the rest mass of the gas particle. Evidently, photons (i.e. particles with $m = 0$) satisfy this condition. 56


Finally, let me note that Eq. (92) allows the following interesting interpretation. The last of Eqs. (1.60), being applied to Eq. (92), shows that in this particular case the grand potential $\Omega$ equals $(-E/3)$, i.e. coincides with the free energy $F$ given by Eq. (91). But according to the definition of $\Omega$, the first of Eqs. (1.60), this means that the chemical potential of the electromagnetic field excitations vanishes:

$$\mu = \frac{F - \Omega}{N} = 0. \tag{2.93}$$


In Sec. 8 below, we will see that the same result follows from Eq. (72) and the Bose-Einstein 
distribution, and discuss its physical sense. 


55 See, e.g., EM Sec. 9.8. 

56 Please note that according to Eqs. (1.44), (88), and (92), the difference between the equations of state of the photon gas and an ideal gas of nonrelativistic particles, expressed in the more usual form - as $P = P(V, T)$, is much more dramatic: $P \propto T^4V^0$ instead of $P \propto T^1V^{-1}$.

(ii) Specific heat of solids. The heat capacity of solids is readily measurable, and in the early 1900s its experimentally observed temperature dependence served as an important test for emerging quantum theories. However, theoretical calculation of $C_V$ is not simple, 57 even for insulators whose specific heat is due to thermally-induced vibrations of their crystal lattice alone. 58 Indeed, a solid may be treated as an elastic continuum only at relatively low frequencies. Such a continuum supports three different modes of mechanical waves with the same frequency $\omega$, which obey similar, linear dispersion laws, $\omega = vk$, but the velocity $v = v_l$ for one of these modes (the longitudinal sound) is higher than that ($v_t$) of the two other modes (the transverse sound). 59 At such frequencies the wave mode density may be described by an evident modification of Eq. (83):

$$dN = V\frac{1}{(2\pi)^3}\left(\frac{1}{v_l^3} + \frac{2}{v_t^3}\right)4\pi\omega^2\,d\omega. \tag{2.94a}$$


For what follows, it is convenient to rewrite this relation in a form similar to Eq. (83):

$$dN = \frac{3V}{(2\pi)^3v^3}\,4\pi\omega^2\,d\omega, \quad \text{where } v \equiv \left[\frac{1}{3}\left(\frac{1}{v_l^3} + \frac{2}{v_t^3}\right)\right]^{-1/3}. \tag{2.94b}$$

However, wave theory shows 60 that as frequency $\omega$ of a sound wave in a periodic structure is increased so that its half-wavelength $\pi/k$ approaches the crystal period $d$, the dispersion law $\omega(k)$ becomes nonlinear before the frequency reaches a maximum at $k = \pi/d$. To make things even more complex, 3D crystals are generally anisotropic, so that the dispersion law is different in different directions of wave propagation. As a result, the exact statistics of thermally excited sound waves, and hence the heat capacity of crystals, is rather complex and specific for each particular crystal type.

In 1912, P. Debye suggested an approximate theory of the temperature dependence of the specific heat, which is in surprisingly good agreement with experiment for many insulators, including polycrystalline and amorphous materials. In his model, the linear (acoustic) dispersion law $\omega = vk$, with the effective sound velocity $v$, defined by the latter of Eqs. (94b), is assumed to be exact all the way up to some cutoff frequency $\omega_D$, the same for all three wave modes. This cutoff frequency may be defined by the requirement that the total number of acoustic modes, calculated within this model from Eq. (94b),

$$N = V\frac{3}{(2\pi)^3v^3}\int_0^{\omega_D}4\pi\omega^2\,d\omega = \frac{V\omega_D^3}{2\pi^2v^3}, \tag{2.95}$$


is equal to the universal number $N = 3nV$ of degrees of freedom (and hence of independent oscillation modes) in a system of $nV$ elastically coupled particles, where $n$ is the atomic density of the crystal, i.e. the number of atoms per unit volume. Within this model, Eq. (72) immediately yields the following expressions for the average energy and specific heat (in thermal equilibrium at temperature $T$):

57 Due to the low thermal expansion of solids, the difference between their $C_V$ and $C_P$ is small.

58 In good conductors (e.g., metals), specific heat is contributed (and at low temperatures, dominated) by free 
electrons - see Sec. 3.3 below. 

59 See, e.g., CM Sec. 7.7. 

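60 See, e.g., CM Sec. 5.3, in particular Fig. 5.5 and its discussion.

$$E = V\frac{3}{(2\pi)^3v^3}\int_0^{\omega_D}\frac{\hbar\omega}{e^{\hbar\omega/T} - 1}\,4\pi\omega^2\,d\omega = 3nVT\,D(x)\Big|_{x = T_D/T}, \tag{2.96}$$

$$c_V \equiv \frac{C_V}{nV} = \frac{1}{nV}\left(\frac{\partial E}{\partial T}\right)_V = 3\left[D(x) - x\frac{dD(x)}{dx}\right]_{x = T_D/T}, \tag{2.97}$$

where $T_D \equiv \hbar\omega_D$ is called the Debye temperature, 61 and

$$D(x) \equiv \frac{3}{x^3}\int_0^x\frac{\xi^3\,d\xi}{e^\xi - 1} \to \begin{cases} 1, & \text{at } x \to 0, \\ \pi^4/5x^3, & \text{at } x \to \infty, \end{cases} \tag{2.98}$$

is the Debye function. Red lines in Fig. 11 show the temperature dependence of the specific heat $c_V$ (per atom) within the Debye model. At high temperatures, it approaches a constant value of 3, corresponding to energy $E = 3nVT$, in accordance with the equipartition theorem for each of 3 degrees of freedom of each atom. (This model-insensitive value of $c_V$ is known as the Dulong-Petit law.) In the opposite limit of low temperatures, the specific heat is much smaller: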


$$c_V \approx \frac{12\pi^4}{5}\left(\frac{T}{T_D}\right)^3 \ll 1, \tag{2.99}$$

reflecting the reduction of the number of excited waves with $\hbar\omega < T$ as the temperature is decreased.
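For readers who would like to reproduce the Debye curve, here is a minimal numerical sketch of Eqs. (2.97)-(2.98). It assumes a simple midpoint-rule evaluation of the Debye function, and uses the relation $dD/dx = -3D/x + 3/(e^x - 1)$, which follows from differentiating the definition (2.98); the function names and the number of integration steps are illustrative choices.

```python
import math

def debye_function(x, steps=2000):
    """D(x) = (3/x^3) * integral from 0 to x of xi^3/(e^xi - 1), Eq. (2.98)."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * h          # midpoint of the i-th sub-interval
        total += xi**3 / math.expm1(xi)
    return 3.0 * total * h / x**3

def debye_heat_capacity(t):
    """Specific heat per atom, Eq. (2.97), as a function of t = T/T_D."""
    x = 1.0 / t
    d = debye_function(x)
    d_prime = -3.0 * d / x + 3.0 / math.expm1(x)  # dD/dx, from the definition
    return 3.0 * (d - x * d_prime)

for t in (0.02, 0.1, 1.0, 10.0):
    print(t, debye_heat_capacity(t))
```

At `t = 10` the result is close to the Dulong-Petit value 3, while at `t = 0.02` it follows the cubic law (2.99).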




Fig. 2.11. Temperature dependence of the specific heat in the Debye (red lines) and Einstein (blue lines) models.

As a historic curiosity, P. Debye's work followed one by A. Einstein, who had suggested (in 1907) a simpler model of crystal vibrations. In this model, all $3nV$ independent oscillatory modes of $nV$ atoms of the crystal have approximately the same frequency, say $\omega_E$, and Eq. (72) immediately yields


$$E = 3nV\,\frac{\hbar\omega_E}{e^{\hbar\omega_E/T} - 1}, \tag{2.100}$$

so that the specific heat is functionally similar to Eq. (75):

$$c_V \equiv \frac{1}{nV}\left(\frac{\partial E}{\partial T}\right)_V = 3\left[\frac{\hbar\omega_E/2T}{\sinh(\hbar\omega_E/2T)}\right]^2. \tag{2.101}$$

61 In SI units, Debye temperatures $T_D$ are of the order of a few hundred K for most simple solids (e.g., close to 430 K for aluminum and 340 K for copper), with somewhat lower values for crystals with heavy atoms (~105 K for lead), and reach the highest value ~2,200 K for diamond with its relatively light atoms and very stiff lattice.


This dependence $c_V(T)$ is shown with blue lines in Fig. 11 (assuming, for the sake of simplicity, $\hbar\omega_E = T_D$). At high temperatures, this result does satisfy the universal Dulong-Petit law ($c_V = 3$), but at low temperatures Einstein's model predicts a much faster (exponential) drop of the specific heat as the temperature is reduced. (The difference between the Debye and Einstein models is not too spectacular on the linear scale, but in the log-log plot, shown on the right panel of Fig. 11, it is rather dramatic. 62) The Debye model is in much better agreement with experimental data for simple, monoatomic crystals, thus confirming the conceptual correctness of his wave-based approach.
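How dramatic the low-temperature difference is may be seen from a direct comparison of Eq. (2.101) with the Debye asymptote (2.99). The sketch below assumes, as in Fig. 11, that $\hbar\omega_E = T_D$, so that both results depend on the single ratio $t \equiv T/T_D$; the function names are hypothetical.

```python
import math

def einstein_heat_capacity(t):
    """Eq. (2.101), per atom, as a function of t = T/(hbar*omega_E)."""
    y = 0.5 / t
    return 3.0 * (y / math.sinh(y)) ** 2

def debye_low_temperature(t):
    """Eq. (2.99), the Debye low-temperature asymptote, with t = T/T_D."""
    return 12.0 * math.pi**4 / 5.0 * t**3

# At t = 0.05 the Einstein result is already orders of magnitude smaller
for t in (0.2, 0.1, 0.05):
    print(t, einstein_heat_capacity(t), debye_low_temperature(t))
```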

Note, however, that when a genius such as A. Einstein makes an error, there is probably some deep and important reason behind it. Indeed, crystals with the basic cell consisting of atoms of two or more types (such as NaCl, etc.), feature two or more separate branches of the dispersion law $\omega(k)$ - see, e.g., Fig. 12. 63

Fig. 2.12. Dispersion relation for longitudinal waves in a simple 1D model of a solid, with similar interparticle distances $d$, but alternating particle masses, plotted for a particular mass ratio $r = 5$. (Horizontal axis: $kd/\pi$.)


While the lower “acoustic” branch is virtually similar to those for monoatomic crystals, and may be approximated by the Debye model, $\omega = vk$, reasonably well, the upper (“optical” 64) branch does not approach $\omega = 0$ at any $k$. Moreover, for large values of the atom mass ratio $r$, the optical branches are almost flat, with virtually $k$-independent frequencies $\omega_0$ that correspond to simple oscillations of each light atom between its heavy counterparts. For thermal excitations of such oscillations, and their contribution to the specific heat, the Einstein model (with $\omega_E = \omega_0$) gives a very good approximation, so that the specific heat may be well described by a sum of the Debye and Einstein laws (97) and (101), with appropriate weights.

62 This is why there is a general “rule of thumb” in science: if you plot your data on a linear rather than log scale, you had better have a good excuse ready. (A valid excuse example: the variable you are plotting changes sign within the important range.)

63 This is the exact solution of a particular 1D model of such a crystal - see CM Chapter 5.

64 This term stems from the fact that at $k \to 0$, the mechanical waves corresponding to these branches have phase velocities $v_{ph} \equiv \omega(k)/k$ that are much higher than that of the acoustic waves, and may approach the speed of light. As a result, these waves can strongly interact with electromagnetic (practically, optical) waves of the same frequency, while the acoustic waves cannot.


2.7. Grand canonical ensemble and distribution 

As we have seen, the Gibbs distribution is a very convenient way to calculate statistical and thermodynamic properties of systems with a fixed number $N$ of particles. However, for systems in which $N$ may vary, another distribution is preferable for some applications. Several examples of such situations (as well as the basic thermodynamics of such systems) have already been discussed in Sec. 1.5. Perhaps even more importantly, statistical distributions for systems with variable $N$ are also applicable to the ensembles of independent particles on a certain single-particle energy level - see the next section.

With this motivation, let us consider what is called the grand canonical ensemble (Fig. 13). It is similar to the canonical ensemble discussed in the previous section (Fig. 6) in all aspects, except that now the system under study and the heat bath (in this case typically called the environment) may exchange not only heat but also particles. In all system members of the ensemble, the environments are in both thermal and chemical equilibrium, and their temperatures $T$ and chemical potentials $\mu$ are equal.



Fig. 2.13. Member of a grand canonical ensemble.


Now let us assume that the system of interest is also in chemical and thermal equilibrium with its environment. Then using exactly the same arguments as in Sec. 4 (including the specification of a microcanonical sub-ensemble with fixed $E_\Sigma$ and $N_\Sigma$), we may generalize Eq. (55), taking into account that entropy $S_{env}$ of the environment is now a function of not only its energy $E_{env} = E_\Sigma - E_{m,N}$, 65 but also the number of particles $N_{env} = N_\Sigma - N$, with $E_\Sigma$ and $N_\Sigma$ fixed:

$$\begin{aligned} \ln W_{m,N} \propto \ln M &= \ln g_{env}(E_\Sigma - E_{m,N}, N_\Sigma - N) + \ln\Delta E_\Sigma = S_{env}(E_\Sigma - E_{m,N}, N_\Sigma - N) + \text{const} \\ &\approx -\frac{\partial S_{env}}{\partial E_{env}}\bigg|_{E_\Sigma, N_\Sigma} E_{m,N} - \frac{\partial S_{env}}{\partial N_{env}}\bigg|_{E_\Sigma, N_\Sigma} N + \text{const}. \end{aligned} \tag{2.102}$$


In order to simplify this relation, we may rewrite Eq. (1.52) in the equivalent form

$$dS = \frac{1}{T}\,dE + \frac{P}{T}\,dV - \frac{\mu}{T}\,dN. \tag{2.103}$$

Hence, if entropy $S$ of a system is expressed as a function of $E$, $V$, and $N$, then

$$\left(\frac{\partial S}{\partial E}\right)_{V,N} = \frac{1}{T}, \quad \left(\frac{\partial S}{\partial V}\right)_{E,N} = \frac{P}{T}, \quad \left(\frac{\partial S}{\partial N}\right)_{E,V} = -\frac{\mu}{T}. \tag{2.104}$$

65 The additional index in the new notation $E_{m,N}$ for the energy of the system of interest reflects the fact that its eigenvalue spectrum is generally dependent on the number $N$ of particles in it.


Applying the first one and the last one of these relations to Eq. (102), and using the equality of the temperatures $T$ and chemical potentials $\mu$ of the system under study and its environment, at their equilibrium, discussed in Sec. 1.5, we get

$$\ln W_{m,N} = -\frac{1}{T}E_{m,N} + \frac{\mu}{T}N + \text{const}. \tag{2.105}$$


Again, exactly as in the derivation of the Gibbs distribution in Sec. 4, we may argue that since $E_{m,N}$, $T$ and $\mu$ do not depend on the choice of the environment's size, i.e. on $E_\Sigma$ and $N_\Sigma$, the probability $W_{m,N}$ for a system to have $N$ particles and be in the $m$-th quantum state in the whole grand canonical ensemble should also obey a relation similar to Eq. (105). As a result, we get the so-called grand canonical distribution:

$$W_{m,N} = \frac{1}{Z_G}\exp\left\{\frac{\mu N - E_{m,N}}{T}\right\}. \tag{2.106}$$

Just as in the case of the Gibbs distribution, constant $Z_G$ (most often called the grand statistical sum, but sometimes the “grand partition function”) should be determined from the probability normalization condition, now with the summation of probabilities $W_{m,N}$ over all possible values of both $m$ and $N$:

$$Z_G = \sum_{m,N}\exp\left\{\frac{\mu N - E_{m,N}}{T}\right\}. \tag{2.107}$$


Now, using the general Eq. (29) to calculate entropy for distribution (106) (exactly as we did for the canonical ensemble), we get the following expression,

$$S = -\sum_{m,N}W_{m,N}\ln W_{m,N} = \ln Z_G + \frac{E}{T} - \frac{\mu\langle N\rangle}{T}, \tag{2.108}$$


which is evidently a generalization of Eq. (62). 66 We see that now the grand thermodynamic potential $\Omega$ (rather than the free energy $F$) may be expressed directly via the normalization coefficient $Z_G$:

$$\Omega \equiv F - \mu\langle N\rangle = E - TS - \mu\langle N\rangle = T\ln\frac{1}{Z_G} = -T\ln\sum_{m,N}\exp\left\{\frac{\mu N - E_{m,N}}{T}\right\}. \tag{2.109}$$


Finally, solving the last equality for $Z_G$, and plugging the result back into Eq. (106), we can rewrite the grand canonical distribution in the form

$$W_{m,N} = \exp\left\{\frac{\Omega + \mu N - E_{m,N}}{T}\right\}, \tag{2.110}$$

66 The average number of particles $\langle N\rangle$ is of course exactly what was called $N$ in thermodynamics (see Ch. 1), but I need to keep this explicit notation here to make a clear distinction between this average value of the variable, and its particular values participating in Eqs. (102)-(110).


similar to Eq. (65) for the Gibbs distribution. Indeed, in the particular case when the number $N$ of particles is fixed, $N = \langle N\rangle$, so that $\Omega + \mu N = \Omega + \mu\langle N\rangle \equiv F$, Eq. (110) is reduced right to Eq. (65).
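The chain of relations (2.106)-(2.109) may be illustrated on a toy system with just a few states. In the sketch below, the list of `(N, E)` pairs, the values of `mu` and `T`, and all names are purely hypothetical; the only point is to show that $\Omega = -T\ln Z_G$ coincides with $E - TS - \mu\langle N\rangle$ when the entropy is calculated from the probabilities.

```python
import math

def grand_canonical(levels, mu, T):
    """levels: list of (N, E) pairs, one per quantum state of a toy system."""
    weights = [math.exp((mu * N - E) / T) for N, E in levels]
    Z_G = sum(weights)                                   # Eq. (2.107)
    W = [w / Z_G for w in weights]                       # Eq. (2.106)
    N_avg = sum(p * N for p, (N, _) in zip(W, levels))   # average particle number
    E_avg = sum(p * E for p, (_, E) in zip(W, levels))   # average energy
    S = -sum(p * math.log(p) for p in W)                 # entropy, from probabilities
    Omega = -T * math.log(Z_G)                           # Eq. (2.109)
    return W, N_avg, E_avg, S, Omega

# A made-up spectrum: one empty state, two one-particle states, one two-particle state
toy_levels = [(0, 0.0), (1, 0.3), (1, 0.7), (2, 1.2)]
W, N_avg, E_avg, S, Omega = grand_canonical(toy_levels, mu=0.5, T=1.0)
print(Omega, E_avg - 1.0 * S - 0.5 * N_avg)  # the two numbers should coincide
```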


2.8. Systems of independent particles 

Now we will apply the general statistical distributions discussed above to a simple but very important case when each system we are considering consists of many similar particles whose explicit (physical) interaction is negligible. As a result, each particular energy value $E_{m,N}$ of such a system may be presented as a sum of energies $\varepsilon_k$ of its particles, where index $k$ numbers single-particle energy levels (rather than those of the whole system, as index $m$ does).

Let us start with the classical limit. In classical mechanics, the quantization effects are negligible, i.e. there is a virtually infinite number of states $k$ within each finite energy interval. However, it is convenient to keep, for the time being, the discrete-state language, with the understanding that the average number $\langle N_k\rangle$ of particles in each of these states, frequently called the state occupancy, is very small. In this case, we may apply the Gibbs distribution to the canonical ensemble of single particles, and hence use it with the substitution $E_{m,N} \to \varepsilon_k$, so that Eq. (58) becomes

$$\langle N_k\rangle = c\exp\left\{-\frac{\varepsilon_k}{T}\right\} \ll 1, \tag{2.111}$$

where constant $c$ should be found from the normalization condition:

$$\sum_k\langle N_k\rangle = 1. \tag{2.112}$$

This is the famous Boltzmann distribution. 67 Despite its superficial similarity to the Gibbs distribution (58), let me emphasize the conceptual difference between these two results. The Gibbs distribution describes the probability to find the whole system on energy level $E_m$, and it is always valid - more exactly, for a canonical ensemble of systems in thermodynamic equilibrium. On the other hand, the Boltzmann distribution describes the occupancy of an energy level of a single particle, and for systems of identical particles is valid only in the classical limit $\langle N_k\rangle \ll 1$, even if the particles do not interact directly.

The last fact may be surprising, because it may seem that as soon as particles of the system are independent, nothing prevents us from using the Gibbs distribution to derive Eq. (111), regardless of the value of $\langle N_k\rangle$. This is indeed true if the particles are distinguishable, i.e. may be distinguished from each other - say by their fixed spatial positions, or by the states of certain internal degrees of freedom (say, spin), or by any other “pencil mark”. However, it is an experimental fact that elementary particles of each particular type (say, electrons) are identical to each other, i.e. cannot be “pencil-marked”. For such particles we have to be more careful: even if they do not interact explicitly, there is still some implicit dependence in their behavior, which is especially evident for the so-called fermions (fundamental particles with half-integer spin): they obey the Pauli exclusion principle that forbids two identical particles to be in the same quantum state, even if they do not interact explicitly. 68

67 The distribution was first suggested in 1877 by the founding father of statistical physics, L. Boltzmann. For the particular case when $\varepsilon$ is the kinetic energy of a free classical particle (and hence has a continuous spectrum), it is reduced to the Maxwell distribution - see Sec. 3.1 below.

Note that the term “the same quantum state” carries a heavy meaning load here. For example, if two particles are confined to stay in different spatial positions (say, reliably locked in different boxes), they are distinguishable even if they are internally identical. Thus the Pauli principle, as well as other identity effects such as Bose-Einstein condensation, to be discussed in the next chapter, are important only when identical particles may move in the same spatial region. In order to describe this case, instead of “identical”, it is much better to use a more precise (though ugly) term: indistinguishable particles. 69

In order to take these effects into account, let us examine the effects of nonvanishing occupancy $\langle N_k\rangle \sim 1$ on statistical properties of a system of many non-interacting but indistinguishable particles (at the first stage of calculation, either fermions or bosons) in equilibrium, and apply the grand canonical distribution (109) to a very interesting particular grand canonical ensemble: a subset of particles in the same quantum state $k$ (Fig. 14).

Fig. 2.14. Grand canonical ensemble of particles in the same quantum state (with eigenenergy $\varepsilon_k$).


In this ensemble, the role of the environment is played by the particles in all other states k’ ^ k, 
because due to infinitesimal interactions, the particles may change their states. In equilibrium, the 
chemical potential // and temperature T of the system should not depend on the state number k, but the 
grand thermodynamic potential Q* of the chosen particle subset may. Replacing N with N k - the 
particular (not average!) number of particles in k tb state, and the particular energy value E„ hN with s k Nk, 
we may reduce Eq. (109) to 

pN k ~s k N k 
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(2.113) 


68 See, e.g., QM Sec. 8.1. 

69 This invites a natural question: what particles are “elementary enough” for the identity? For example, protons 
and neutrons have an internal structure, in some sense consisting of quarks and gluons; may they be considered 
elementary? Next, if protons and neutrons are elementary, are atoms? molecules? What about really large 
molecules (such as proteins)? viruses? The general answer to these questions, given by quantum mechanics (or 
rather experiment :-), is that any particles/systems, no matter how large and complex they are, are identical if they 
have exactly the same internal structure, and also are exactly in the same internal quantum state - for example, in 
the ground state of all their internal degrees of freedom. 
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where the summation should be carried out over all possible values of N k . For the final calculation of 
this sum, the elementary particle type becomes essential. 


In particular, for fermions, obeying the Pauli principle, numbers $N_k$ in Eq. (113) may take only 
two values, either 0 (state k is unoccupied) or 1 (the state is occupied), and the summation gives 

$$\Omega_k = -T \ln\left[1 + \exp\left\{\frac{\mu - \varepsilon_k}{T}\right\}\right]. \tag{2.114}$$


Now the average occupancy may be calculated from the last of Eqs. (1.62), with N 
replaced with $\langle N_k \rangle$: 

$$\langle N_k \rangle = -\frac{\partial \Omega_k}{\partial \mu} = \frac{1}{\exp\{(\varepsilon_k - \mu)/T\} + 1}. \tag{2.115}$$


This is the famous Fermi-Dirac distribution, derived in 1926 independently by E. Fermi and P. Dirac. 
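The step from Eq. (114) to Eq. (115) may be checked numerically. The following is a minimal sketch, assuming temperature in energy units (as everywhere in these notes); the function names and the numerical values are illustrative only, not part of the original derivation.

```python
import math

def omega_fermi(mu, eps, T):
    """Grand potential of a single fermionic state, Eq. (2.114)."""
    return -T * math.log(1.0 + math.exp((mu - eps) / T))

def fermi_dirac(eps, mu, T):
    """Average occupancy of the state, Eq. (2.115)."""
    return 1.0 / (math.exp((eps - mu) / T) + 1.0)

def occupancy_from_omega(mu, eps, T, h=1e-6):
    """<N_k> = -d(Omega_k)/d(mu), estimated by a central finite difference."""
    return -(omega_fermi(mu + h, eps, T) - omega_fermi(mu - h, eps, T)) / (2.0 * h)

if __name__ == "__main__":
    eps, mu, T = 1.0, 0.3, 0.5  # arbitrary illustrative values
    print(occupancy_from_omega(mu, eps, T), fermi_dirac(eps, mu, T))
```

The two printed numbers should agree to within the accuracy of the finite-difference estimate.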


On the other hand, bosons do not obey the Pauli principle, and for them numbers $N_k$ can take any 
non-negative integer values. In this case, Eq. (113) turns into the following equality: 

$$\Omega_k = -T \ln \sum_{N_k = 0}^{\infty} \lambda^{N_k}, \quad \text{with } \lambda \equiv \exp\left\{\frac{\mu - \varepsilon_k}{T}\right\}. \tag{2.116}$$

This sum is just the usual geometric progression again, which converges if $\lambda < 1$, giving 

$$\Omega_k = -T \ln \frac{1}{1 - \lambda} = T \ln\left[1 - \exp\left\{\frac{\mu - \varepsilon_k}{T}\right\}\right], \quad \text{for } \mu < \varepsilon_k. \tag{2.117}$$

In this case the average occupancy, again calculated using Eq. (1.62) with N replaced with $\langle N_k \rangle$, obeys 
the Bose-Einstein distribution, 

$$\langle N_k \rangle = \frac{1}{\exp\{(\varepsilon_k - \mu)/T\} - 1}, \quad \text{for } \mu < \varepsilon_k, \tag{2.118}$$


which was derived in 1924 by S. Bose (for the particular case $\mu = 0$) and generalized in 1925 by A. 
Einstein for an arbitrary chemical potential. In particular, comparing Eq. (118) with Eq. (72), we see that 
harmonic oscillator excitations, 70 each with energy $\hbar\omega$, may be considered as bosons, with zero 
chemical potential. We have already obtained this result ($\mu = 0$) in a different way - see Eq. (93). Its 
physical interpretation is that the oscillator excitations may be created inside the system, so that there is 
no energy cost $\mu$ of moving them into the system from its environment. 
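The boson case may be checked in the same way: a truncated geometric sum over $N_k$ should reproduce the closed form (117), and its derivative over $\mu$ should give the distribution (118). This is a sketch under the same assumptions (temperature in energy units, illustrative names and values); the truncation limit `n_max` is an arbitrary choice that is sufficient only when $\lambda$ is not too close to 1.

```python
import math

def omega_bose_sum(mu, eps, T, n_max=200):
    """Grand potential from the explicit sum over N_k = 0..n_max (truncated)."""
    lam = math.exp((mu - eps) / T)
    return -T * math.log(sum(lam ** n for n in range(n_max + 1)))

def omega_bose_closed(mu, eps, T):
    """Closed form of Eq. (2.117), valid for mu < eps."""
    return T * math.log(1.0 - math.exp((mu - eps) / T))

def bose_einstein(eps, mu, T):
    """Average occupancy of Eq. (2.118), valid for mu < eps."""
    return 1.0 / (math.exp((eps - mu) / T) - 1.0)

def occupancy_from_omega(mu, eps, T, h=1e-6):
    """<N_k> = -d(Omega_k)/d(mu), estimated by a central finite difference."""
    return -(omega_bose_closed(mu + h, eps, T)
             - omega_bose_closed(mu - h, eps, T)) / (2.0 * h)

if __name__ == "__main__":
    eps, mu, T = 1.0, 0.3, 0.5  # arbitrary illustrative values with mu < eps
    print(omega_bose_sum(mu, eps, T), omega_bose_closed(mu, eps, T))
    print(occupancy_from_omega(mu, eps, T), bose_einstein(eps, mu, T))
```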


The simple form of Eqs. (115) and (118), and their similarity (besides “only” the difference of 
the signs before unity in their denominators), is one of the most beautiful results of physics. This similarity 
should not disguise the fact that the energy dependences of $\langle N_k \rangle$, given by these two formulas, are 
rather different - see Fig. 15. In the Fermi-Dirac statistics, the average level occupancy is finite (and 
below 1) at any energy, while in the Bose-Einstein statistics it may be above 1, and even diverges at $\varepsilon_k \to \mu$. 
However, for any of these distributions, in the limit when the difference $(\varepsilon_k - \mu)$ is much larger 
than temperature T for all k, the occupancies become small, $\langle N_k \rangle \ll 1$, and both distributions coincide with each other, 
as well as with the Boltzmann distribution (111) with $c = \exp\{\mu/T\}$. (For a gas with a fixed number of particles, this 
limit is reached at sufficiently high temperatures, when $\mu$ becomes large and negative.) The last distribution, therefore, 
serves as the high-temperature limit for quantum particles of both sorts. 

70 As the reader certainly knows, for the electromagnetic field oscillators, such excitations are called photons; for 
mechanical oscillation modes, phonons. It is important, however, not to confuse these mode excitations with the 
oscillators as such, and to be very careful in ascribing to them certain spatial locations - see, e.g., QM Sec. 9.1. 
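The convergence of the three distributions at small occupancies is easy to see numerically. The sketch below tabulates them as functions of the single dimensionless variable $x = (\varepsilon_k - \mu)/T$; the function names and the sample values of x are illustrative choices.

```python
import math

def fermi_dirac(x):
    """<N_k> of Eq. (2.115) as a function of x = (eps_k - mu)/T."""
    return 1.0 / (math.exp(x) + 1.0)

def bose_einstein(x):
    """<N_k> of Eq. (2.118); defined only for x > 0, i.e. eps_k > mu."""
    if x <= 0:
        raise ValueError("Bose-Einstein occupancy requires eps_k > mu")
    return 1.0 / (math.exp(x) - 1.0)

def boltzmann(x):
    """Boltzmann occupancy c*exp(-eps_k/T) with c = exp(mu/T)."""
    return math.exp(-x)

if __name__ == "__main__":
    for x in (0.1, 1.0, 3.0, 10.0):
        print(f"x={x:5.1f}  FD={fermi_dirac(x):.3e}  "
              f"Boltzmann={boltzmann(x):.3e}  BE={bose_einstein(x):.3e}")
```

For any x > 0 the Boltzmann value lies between the Fermi-Dirac and Bose-Einstein ones, and the relative difference between all three shrinks as exp(-x).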

A natural question now is how to find the chemical potential $\mu$ participating in Eqs. (115) and 
(118). In the grand canonical ensemble as such (Fig. 13), with the number of particles variable, the value of 
$\mu$ is imposed by the system’s environment. However, both the Fermi-Dirac and Bose-Einstein distributions 
are also applicable to equilibrium systems with a fixed but large number N of particles. In these 
conditions, the role of the environment for some subset of $N' \ll N$ particles is played by the remaining 
$N - N'$ particles. In this case, $\mu$ may be found by calculating $\langle N \rangle$ from the corresponding distribution, 
and then requiring it to be equal to the genuine number of particles in the system. In the next section, we 
will perform such calculations for several particular systems. 
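The recipe of the last paragraph may be sketched numerically as follows. The example assumes a hypothetical set of equidistant single-particle levels occupied by fermions, and uses plain bisection (my choice of root-finding method, not prescribed by the text); it relies on the fact that the total $\langle N \rangle$ grows monotonically with $\mu$, and requires the particle number to be between 0 and the number of levels.

```python
import math

def fermi_dirac(eps, mu, T):
    """Eq. (2.115), with guards against overflow of exp()."""
    x = (eps - mu) / T
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (math.exp(x) + 1.0)

def total_number(mu, levels, T):
    """<N> as the sum of average occupancies of all single-particle levels."""
    return sum(fermi_dirac(eps, mu, T) for eps in levels)

def solve_mu(levels, n_particles, T, tol=1e-12):
    """Find mu from the requirement <N>(mu) = n_particles by bisection."""
    lo = min(levels) - 50.0 * T   # here <N> is practically zero
    hi = max(levels) + 50.0 * T   # here <N> is practically len(levels)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_number(mid, levels, T) < n_particles:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    levels = [float(k) for k in range(10)]   # hypothetical levels 0, 1, ..., 9
    print(solve_mu(levels, 5, 0.5))
```

For this symmetric half-filled example the result should sit in the middle of the spectrum, between the fifth and the sixth level.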





















Fig. 2.15. Fermi-Dirac (blue line) and Bose-Einstein (red line) distributions, and the Boltzmann distribution with $c = \exp\{\mu/T\}$ (black line), plotted as functions of $(\varepsilon_k - \mu)/T$. 


For those and other applications, it will be convenient for us to have ready expressions for 
entropy S of a general (i.e. not necessarily equilibrium) state of systems of independent Fermi or Bose 
particles, expressed not as a function of $W_m$ of the whole system - as Eq. (29) does, but as a function of 
the average occupancy numbers $\langle N_k \rangle$. For that, let us consider an ensemble of composite systems, each consisting of 
$M \gg 1$ similar but distinct component systems, numbered by index m = 1, 2, ... M, with independent (i.e. 
not explicitly interacting) particles. We will assume that though in each of the M component systems, the 
number $N_k^{(m)}$ of particles in its k-th quantum state may be different (Fig. 16), their total number $N_k^{(\Sigma)}$ 
in the composite system is fixed: 

$$\sum_{m=1}^{M} N_k^{(m)} = N_k^{(\Sigma)}. \tag{2.119}$$


Fig. 2.16. Composite system with a certain distribution of $N_k^{(\Sigma)}$ particles in the k-th state between M component systems. (The figure shows the number $N_k^{(m)}$ of particles on the k-th single-particle energy level $\varepsilon_k$ in each of the component systems, numbered 1, 2, ... m, ... M.) 

As a result, the total energy of the composite system is fixed as well, 

$$\sum_{m=1}^{M} N_k^{(m)} \varepsilon_k = N_k^{(\Sigma)} \varepsilon_k = \text{const}, \tag{2.120}$$

so that an ensemble of many such composite systems (with the same k), in equilibrium, is 
microcanonical. According to Eq. (24a), the average entropy $S_k$ per component system may be 
calculated as 

$$S_k = \lim_{M, N_k^{(\Sigma)} \to \infty} \frac{\ln M_k}{M}, \tag{2.121}$$

where $M_k$ is the number of possible different ways such a composite system (with fixed $N_k^{(\Sigma)}$) may be 
implemented. 

Let us start the calculation of $M_k$ with Fermi particles - for which the Pauli principle is valid. 
Here the level occupancies $N_k^{(m)}$ may be only equal to 0 or 1, so that the distribution problem is solvable 
only if $N_k^{(\Sigma)} \le M$, and is evidently equivalent to the choice of $N_k^{(\Sigma)}$ balls (in arbitrary order) from the total 
number of M distinct balls. Comparing this formulation with the binomial coefficient definition, 71 we 
immediately have 

$$M_k = {}^{M}C_{N_k^{(\Sigma)}} = \frac{M!}{(M - N_k^{(\Sigma)})!\, N_k^{(\Sigma)}!}. \tag{2.122}$$

From here, using the Stirling formula (again, in its simplest form (27)), we get the fermion entropy 

$$S_k = -\langle N_k \rangle \ln \langle N_k \rangle - (1 - \langle N_k \rangle) \ln (1 - \langle N_k \rangle), \tag{2.123}$$

where 

$$\langle N_k \rangle \equiv \lim_{M, N_k^{(\Sigma)} \to \infty} \frac{N_k^{(\Sigma)}}{M} \tag{2.124}$$

is exactly the average occupancy of the k-th single-particle level in each system that was discussed 
earlier in this section. Since for a Fermi system $\langle N_k \rangle$ is always somewhere between 0 and 1, the 
entropy (123) is always positive. 
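The limit leading from Eq. (122) to Eq. (123) may be illustrated numerically by comparing the exact value of $\ln M_k / M$ for a large but finite M with the closed-form entropy. This is a sketch only; the helper names are mine, and the logarithm of the binomial coefficient is evaluated via the log-gamma function to avoid huge integers.

```python
import math

def ln_binomial(n, k):
    """ln of the binomial coefficient n!/(k!(n-k)!), via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def fermion_entropy(n_avg):
    """Eq. (2.123), for 0 < n_avg < 1."""
    return -n_avg * math.log(n_avg) - (1.0 - n_avg) * math.log(1.0 - n_avg)

def entropy_per_system_fermi(M, n_total):
    """ln(M_k)/M with M_k given by Eq. (2.122), at finite M."""
    return ln_binomial(M, n_total) / M

if __name__ == "__main__":
    for M in (10**2, 10**4, 10**6):
        n_total = 3 * M // 10          # i.e. <N_k> = 0.3
        print(M, entropy_per_system_fermi(M, n_total), fermion_entropy(0.3))
```

The finite-M values approach the limit (123) with a correction of the order of (ln M)/M, as expected from the terms dropped in the simplest form of the Stirling formula.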

In the Bose case, where the Pauli limitation is not valid, the number $N_k^{(m)}$ of particles on the k-th 
level in each of the systems is an arbitrary non-negative integer. Let us consider $N_k^{(\Sigma)}$ particles and (M - 1) 
partitions (shown by vertical lines in Fig. 16) between M systems as $(M - 1 + N_k^{(\Sigma)})$ similar 
mathematical objects ordered along one axis. Then $M_k$ may be calculated as the number of possible 
ways to distribute the (M - 1) indistinguishable partitions among these $(M - 1 + N_k^{(\Sigma)})$ distinct objects, 
i.e. as the following binomial coefficient: 72 

$$M_k = {}^{M + N_k^{(\Sigma)} - 1}C_{M - 1} = \frac{(M - 1 + N_k^{(\Sigma)})!}{(M - 1)!\, N_k^{(\Sigma)}!}. \tag{2.125}$$


71 See, e.g., MA Eq. (2.2). 

72 See also MA Eq. (2.4). 
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Applying the Stirling formula (27) again, we get the following result, 


$$S_k = -\langle N_k \rangle \ln \langle N_k \rangle + (1 + \langle N_k \rangle) \ln (1 + \langle N_k \rangle), \tag{2.126}$$

which is the boson entropy; it again differs from the Fermi case (123) “only” by the signs in the second term, and is valid for 
any positive $\langle N_k \rangle$. 
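The partition-counting argument behind Eq. (125) may be verified by brute force for small numbers, and the resulting entropy (126) by the same large-M comparison as in the Fermi case. The sketch below is illustrative only; the function names are mine, and the brute-force enumeration is practical only for very small M and particle numbers.

```python
import math
from itertools import product

def count_boson_configs(M, n_total):
    """Brute-force number of sets (N^(1), ..., N^(M)) of non-negative
    integers whose sum equals n_total."""
    return sum(1 for occ in product(range(n_total + 1), repeat=M)
               if sum(occ) == n_total)

def boson_multiplicity(M, n_total):
    """M_k of Eq. (2.125)."""
    return math.comb(M - 1 + n_total, n_total)

def ln_boson_multiplicity(M, n_total):
    """ln of Eq. (2.125) via log-gamma, usable for very large arguments."""
    return (math.lgamma(M + n_total) - math.lgamma(M)
            - math.lgamma(n_total + 1))

def boson_entropy(n_avg):
    """Eq. (2.126), for n_avg > 0."""
    return -n_avg * math.log(n_avg) + (1.0 + n_avg) * math.log(1.0 + n_avg)

if __name__ == "__main__":
    print(count_boson_configs(4, 5), boson_multiplicity(4, 5))
    M = 10**6
    print(ln_boson_multiplicity(M, 2 * M) / M, boson_entropy(2.0))
```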


Expressions (123) and (126) are valid for an arbitrary (possibly non-equilibrium) case; they may 
also be used for an alternative derivation of the Fermi-Dirac (115) and Bose-Einstein (118) distributions 
valid in equilibrium. For that, we may use the method of Lagrange multipliers, requiring (just like it was 
done in Sec. 2) the total entropy of a system of N independent, similar particles, 

$$S = \sum_k S_k, \tag{2.127}$$

as a function of state occupancies $\langle N_k \rangle$, to attain its maximum, with the conditions of fixed total number 
of particles N and the total energy E: 

$$\sum_k \langle N_k \rangle = N = \text{const}, \qquad \sum_k \langle N_k \rangle \varepsilon_k = E = \text{const}. \tag{2.128}$$

The completion of this calculation is left for the reader’s exercise. 


In the classical limit, when the average occupancies $\langle N_k \rangle$ of all states are small, both the Fermi 
and Bose expressions for $S_k$ tend to the same limit 

$$S_k = -\langle N_k \rangle \ln \langle N_k \rangle, \quad \text{for } \langle N_k \rangle \ll 1. \tag{2.129}$$

This expression, frequently referred to as the Boltzmann (or “classical”) entropy, might be also obtained, 
for arbitrary $\langle N_k \rangle$, directly from Eq. (29) by considering an ensemble of systems, each consisting of just 
one classical particle, so that $E_m \to \varepsilon_k$ and $W_m \to \langle N_k \rangle$. Let me emphasize again that for indistinguishable 
particles, such identification is generally (i.e. at $\langle N_k \rangle \sim 1$) illegitimate even if they do not interact 
explicitly. As we will see in the next chapter, the indistinguishability affects statistical properties of even 
classical particles. 


2.9. Exercise problems 

2.1 . A famous example of the macroscopic irreversibility was suggested in 1907 by P. Ehrenfest. 
Two dogs share $2N \gg 1$ fleas. Each flea may jump to another dog, and the rate (i.e. the probability of 
jumping per unit time) $\Gamma$ of such an event does not depend on time, or on the location of other fleas. 
Find the time evolution of the average number of fleas on a dog, and of the flea-related part of dogs’ 
entropy (at arbitrary initial conditions), and prove that the entropy can only grow. 73 

73 This is essentially a simpler (and funnier :-) version of the particle scattering model used by L. Boltzmann to 
prove his famed H-theorem (1872). Besides all the historic significance of that theorem, the model used by 
Boltzmann (see Sec. 6.2 below) is almost as cartoonish. 

2.2 . Use the microcanonical distribution to calculate thermodynamic properties (including 
entropy, all relevant thermodynamic potentials, and heat capacity), of an ensemble of similar two-level 
systems, in thermodynamic equilibrium at temperature T that is comparable with the energy gap $\Delta$. For 
each variable, sketch its temperature dependence, and find its asymptotic values (or trends) in the 
low-temperature and high-temperature limits. 

Hint : The two-level system is generally defined as any system with just two relevant states 
whose energies, say $E_0$ and $E_1$, are separated by a finite gap $\Delta \equiv E_1 - E_0$. Its most popular (but not the 
only!) example is a spin-½ particle, e.g., an electron, in an external magnetic field. 

2.3 . Solve the previous problem using the Gibbs distribution. Also, calculate the probabilities of 
the energy level occupation, and give physical interpretations of your results, in both temperature limits. 

2.4 . Calculate the low-field magnetic susceptibility $\chi_m$ of a dilute set of non-interacting, 
spontaneous magnetic dipoles m, in thermal equilibrium at temperature T, within two models: 

(i) the dipole moment m is a classical vector of fixed magnitude $m_0$, but arbitrary orientation, and 

(ii) the dipole moment m belongs to a quantum spin-½ particle, and is described by vector 
operator $\hat{\mathbf{m}} = \gamma \hat{\mathbf{S}}$, where $\gamma$ is the gyromagnetic ratio, and $\hat{\mathbf{S}}$ is the vector operator of particle’s spin. 74 

Hint: The low-field magnetic susceptibility of an isotropic medium is defined 75 as 

$$\chi_m \equiv \frac{\partial M_z}{\partial \mathcal{H}},$$

where M is the (average) magnetization of a unit volume, and axis z is aligned with the direction of the 
external magnetic field $\mathcal{H}$. 

2.5 . Calculate the low-field magnetic susceptibility of a set of non-interacting, distinguishable 
particles with an arbitrary spin s, neglecting their orbital motion. Compare the result with the solution of 
the previous problem. 

Hint : Quantum mechanics 76 tells us that the Cartesian component $m_z$ of the magnetic moment of 
such a particle, in the direction of the applied field, may take (2s + 1) values 

$$m_z = \gamma \hbar s_m, \quad \text{where } s_m = -s, -s + 1, \ldots, s - 1, s,$$

where $\gamma$ is the gyromagnetic ratio of the particle, and $\hbar$ is the Planck’s constant. 

2.6 . * Derive a general expression for the average interaction potential between two similar 
magnetic dipoles with fixed magnitude m but arbitrary orientation, at thermal equilibrium. Spell out the 
result in the low-temperature and high-temperature limits. 

2.7 . * Analyze the possibility of using a system of non-interacting spin-½ particles in magnetic 
field for refrigeration. 

Hint : See a footnote in Sec. 1.6. 

74 See, e.g., QM Sec. 4.4. Note that both models assume that the particle’s orbital motion (if any) does not 
contribute to its magnetic moment. 

75 See, e.g., EM Sec. 5.5, in particular Eq. (5.111). 

76 See, e.g., QM Sec. 5.7, in particular Eq. (5.197). 
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2.8 . Use the microcanonical distribution to calculate the average entropy, energy, and pressure of 
a single classical particle of mass m, with no internal degrees of freedom, free to move in volume V, at 
temperature T. 


Hint : Try to make a more accurate calculation than has been done in Sec. 2.2 for the system of N 
harmonic oscillators. For that you will need to know the volume $V_d$ of a d-dimensional hypersphere of 
the unit radius. To avoid being too cruel, I am giving it to you: 

$$V_d = \pi^{d/2} \Big/ \Gamma\left(\frac{d}{2} + 1\right),$$

where $\Gamma(\xi)$ is the gamma-function. 77 


2.9 . Solve the previous problem starting from the Gibbs distribution. 

2.10 . Calculate the average energy, entropy, free energy, and the equation of state of a classical 
2D particle (without internal degrees of freedom), free to move within area A, at temperature T, starting 
from: 

(i) the microcanonical distribution, and 

(ii) the Gibbs distribution. 

Hint : Make the appropriate modification of the notion of pressure. 

2.11 . A quantum particle of mass m is confined to free motion along a 1D segment of length a. 
Using any approach you like, find the average force the particle exerts on walls of such a “1D quantum 
well” in thermal equilibrium, and analyze its temperature dependence, focusing on the low-temperature 
and high-temperature limits. 

Hint : You may consider series $\Theta(\xi) \equiv \sum_{n=1}^{\infty} \exp\{-\xi n^2\}$ as a known function of $\xi$. 78 


2.12 . Rotational properties of diatomic molecules (such as $N_2$, CO, etc.) may be reasonably well 
described using a “dumbbell” model: two point particles, of masses $m_1$ and $m_2$, with a fixed distance d 
between them. Ignoring the translational motion of the molecule as a whole, use this model to 
calculate its heat capacity, and spell out the result in the limits of low and high temperatures. (Quantify 
the conditions.) 

2.13 . Calculate the heat capacity of a diatomic molecule, using the simple model described in 
the previous problem, but now assuming that the rotation is confined to one plane. 79 


77 For its definition and main properties, see, e.g., MA Eqs. (6.6)-(6.9). 

78 It may be reduced to the so-called elliptic theta-function $\theta_3(z, \tau)$ for a particular case z = 0 - see, e.g., Sec. 16.27 
in the Abramowitz-Stegun handbook cited in MA Sec. 16(ii). However, you do not need that (or any other :-) 
handbook to solve this problem. 

79 This is a reasonable model of the constraints imposed on small atomic groups (e.g., ligands) by their 
environment inside some large molecules. 
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2.14 . Low-temperature specific heat of some solids has a considerable contribution from thermal 
excitation of spin waves, whose dispersion law scales as $\omega \propto k^2$ at $\omega \to 0$. 80 Find the temperature 
dependence of this contribution to $C_V$ at low temperatures and discuss conditions of its experimental 
observation. 


2.15 . A rudimentary “zipper” model of DNA replication is a chain of 
N links that may be either open or closed - see Fig. on the right. Opening a 
link increases system’s energy by $\Delta > 0$, and a link may change its state 
(either open or close) only if all links to the left of it are already open. 

Calculate the average number of open links at thermal equilibrium, and analyze its temperature 
dependence in detail, especially for the case $N \gg 1$. 

2.16 . An ensemble of classical 1D particles of mass m, residing in the potential wells 

$$U(x) = \alpha |x|^{\gamma}, \quad \text{with } \gamma > 0,$$

is in thermal equilibrium at temperature T. Calculate the average values of its potential energy U and 
the full energy E using two approaches: 

(i) directly from the Gibbs distribution, and 

(ii) using the virial theorem of classical mechanics. 81 



2.17 . For a thermally-equilibrium ensemble of slightly anharmonic classical 1D oscillators, with 
mass m and potential energy 

$$U(x) = \frac{\kappa}{2} x^2 + \alpha x^3,$$

with small coefficient $\alpha$, calculate $\langle x \rangle$ in the first approximation in low temperature T. 

2.18 . A small conductor (in this context, usually called the single-electron 
box ) is placed between two conducting electrodes, with voltage V applied between 
them. The gap between one of the electrodes and the island is so narrow that 
electrons may tunnel quantum-mechanically through this gap (“weak tunnel 
junction”) - see Fig. on the right. Calculate the average charge of the island as a 
function of V. 

Hint : The quantum-mechanical tunneling of electrons through weak 
junctions 82 between macroscopic conductors, and its subsequent energy relaxation 
inside the conductor, may be considered as a single inelastic (energy-dissipating) event, so that the only 
energy relevant for the thermal equilibrium of the system is its electrostatic potential energy. 



80 Note that the same dispersion law is typical for elastic bending waves in thin rods - see, e.g., CM Sec. 7.8. 

81 See, e.g., CM Problem 1.12. 

82 In this context, weak junction means a tunnel junction with transparency so low that the tunneling electron’s 
wavefunction loses its quantum-mechanical coherence before the electron has time to tunnel back. In a typical 
junction of a macroscopic area this condition is fulfilled if the effective tunnel resistance of the junction is much 
higher than the quantum unit of resistance (see, e.g., QM Sec. 3.2), $R_Q \equiv \pi\hbar/2e^2 \approx 6.5\ \mathrm{k\Omega}$. 

2.19 . An LC circuit (see Fig. on the right) is at thermodynamic 
equilibrium with the environment. Find the r.m.s. fluctuation $\delta V \equiv \langle V^2 \rangle^{1/2}$ of 
the voltage across it, for an arbitrary ratio $T/\hbar\omega$, where $\omega = (LC)^{-1/2}$ is the 
resonance frequency of this “tank circuit”. 

2.20 . Derive Eq. (92) from simplistic arguments, representing the blackbody radiation as an ideal 
gas of photons, treated as ultrarelativistic particles. What do similar arguments give for an ideal gas of 
classical, nonrelativistic particles? 



2.21 . Calculate the enthalpy, the entropy, and the Gibbs energy of the blackbody electromagnetic 
radiation with temperature T, and then use these results to find the law of temperature and pressure drop 
at an adiabatic expansion of the radiation. 

2.22 . As was mentioned in Sec. 2.6(i) of the lecture notes, the relation between the visible 
temperatures $T_{\odot}$ of Sun’s surface and $T_o$ of Earth’s surface follows from the balance of the thermal 
radiation they emit. Prove that this relation indeed follows, with a good precision, from a simple model 
in which the surfaces radiate as perfect black bodies with a constant, average temperature. 

Hint : You may pick up the experimental values you need from any (reliable :-) source. 


2.23 . If a surface is not perfectly radiation-absorbing (“black”), the electromagnetic power of its 
thermal radiation differs from the Stefan law (2.89a) by a frequency-dependent factor $\varepsilon < 1$, called 
emissivity : 

$$\frac{\mathcal{P}}{A} = \varepsilon \sigma T^4.$$

Prove that such surface reflects $(1 - \varepsilon)$ part of incident radiation. 


2.24 . If two black surfaces, facing each other, have different 
temperatures (see Fig. on the right), then according to the Stefan radiation 
law (2.89), there is a net flow of thermal radiation, from a warmer surface 
to the colder one: 

$$\frac{\mathcal{P}_{\text{net}}}{A} = \sigma \left(T_1^4 - T_2^4\right), \quad \text{with } T_2 < T_1.$$

For many applications (including low temperature experiments) this flow is detrimental. One way to 
reduce it is to reduce the emissivity $\varepsilon(\omega)$ of both surfaces - say by covering them with shiny metallic 
films. An alternative way toward the same goal is to place, between the surfaces, a thin layer (usually 
called the thermal shield), with a low emissivity of both surfaces, and disconnected from any heat bath - 
see dashed line in Fig. above. Assuming that the emissivity is the same in both cases, find out which 
way is more efficient. 

Hint : The definition of emissivity may be found, for example, in the previous problem. 


2.25 . Two parallel, well conducting plates of area A are separated by a free-space gap of a 
constant thickness $t \ll A^{1/2}$. Calculate the energy of the spontaneous electromagnetic field inside the gap 
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at thermal equilibrium with temperature T. Specify the validity limits of your result. Does the radiation 
push the plates apart? 

2.26 . Use the Debye theory to estimate the specific heat of aluminum at room temperature (say, 
300 K), and express the result in the following popular units: 

(i) eV/K per atom, 

(ii) J/K per mole, and 

(iii) J/K per gram. 

Compare the last number with the experimental value (from a reliable book or online source). 

2.27 . Use the general Eq. (123) to re-derive the Fermi-Dirac distribution (115) for a system in 
equilibrium. 

2.28 . Each of two similar particles, not interacting directly, may take two quantum states, with 
single-particle energies $\varepsilon$ equal to 0 and $\Delta$. Write down the statistical sum Z of the system, and use it to 
calculate the average total energy $\langle E \rangle$ of the system, for the cases when the particles are: 

(i) distinguishable; 

(ii) indistinguishable fermions; 

(iii) indistinguishable bosons. 

Analyze and interpret the temperature dependence of $\langle E \rangle$ for each case, assuming that $\Delta > 0$. 

2.29 . Calculate the chemical potential of a system of N » 1 independent fermions, kept at fixed 
temperature T, provided that each particle has two non-degenerate energy levels, separated by gap $\Delta$. 
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Chapter 3. Ideal and Not-So-Ideal Gases 

In this chapter, the general approaches discussed in the previous chapters are applied to calculate 
statistical and thermodynamic properties of gases, i.e. collections of identical particles (say, atoms or 
molecules) that are free to move inside a certain volume, either not interacting or weakly interacting 
with each other. 


3.1. Ideal classical gas 

Interactions of typical atoms and molecules are well localized, i.e. rapidly decreasing with 
distance r between them, becoming negligible at a certain distance $r_0$. In a gas of N particles inside 
volume V, the average distance $\langle r \rangle$ between the particles is of the order of $(V/N)^{1/3}$. As a result, if the gas 
density $n \equiv N/V \sim \langle r \rangle^{-3}$ is much lower than $r_0^{-3}$, i.e. if $n r_0^3 \ll 1$, the chance for its particles to approach 
each other and interact is rather small. The model in which such interactions are completely ignored is 
called the ideal gas. 

Let us start with a classical ideal gas, which may be defined as the gas in whose behavior the 
quantum effects are negligible. As we saw in Sec. 2.8, the condition of that is to have the average 
occupancy of each quantum state low: 

$$\langle N_k \rangle \ll 1. \tag{3.1}$$

It may seem that we have already found properties of such a system, in particular the equilibrium 
occupancy of its states - see Eq. (2.111): 

$$\langle N_k \rangle = \text{const} \times \exp\left\{-\frac{\varepsilon_k}{T}\right\}. \tag{3.2}$$


In some sense it is true, but we still need, first, to see what exactly Eq. (2) means for the gas, i.e. a 
system with an essentially continuous energy spectrum, and, second, to show that, rather surprisingly, 
particles’ indistinguishability affects some properties of even classical gases. 

The first of these tasks is evidently the easiest for a gas out of external fields, and with no 
internal degrees of freedom. 1 In this case $\varepsilon_k$ is just the kinetic energy of the particle, which obeys the isotropic 
and parabolic dispersion law 

$$\varepsilon_k = \frac{p^2}{2m} = \frac{p_1^2 + p_2^2 + p_3^2}{2m}. \tag{3.3}$$


Now we have to use two facts from other fields of physics. First, in quantum mechanics, momentum p 
is associated with wavevector k of the de Broglie wave, $\mathbf{p} = \hbar\mathbf{k}$. 2 Second, eigenvalues of k for any 
waves (including de Broglie waves) in free space are uniformly distributed in the momentum space, 
with a constant density of states, given by Eq. (2.82): 

$$\frac{dN_{\text{states}}}{d^3k} = \frac{gV}{(2\pi)^3}, \quad \text{i.e.} \quad \frac{dN_{\text{states}}}{d^3p} = \frac{gV}{(2\pi\hbar)^3}, \tag{3.4}$$

where g is the degeneracy of particle’s internal states (say, for electrons, the spin degeneracy g = 2). 

1 In more realistic cases when particles do have internal degrees of freedom, but they are in certain (say, ground) 
quantum states, Eq. (3) is valid as well, with $\varepsilon_k$ referred to the internal ground-state energy. 

2 See, e.g., QM Sec. 1.2. 


Even regardless of the exact proportionality coefficient between $dN_{\text{states}}$ and $d^3p$, the very fact of 
this proportionality means that the probability dW to find the particle in a small region $d^3p = dp_1 dp_2 dp_3$ 
of the momentum space is proportional to the right-hand part of Eq. (2), with $\varepsilon_k$ given by Eq. (3): 

$$dW = C \exp\left\{-\frac{p^2}{2mT}\right\} d^3p = C \exp\left\{-\frac{p_1^2 + p_2^2 + p_3^2}{2mT}\right\} dp_1 dp_2 dp_3. \tag{3.5}$$


This is the famous Maxwell distribution. 3 The normalization constant C may be readily found 
from the last form of Eq. (5), by requiring the integral of dW over all the momentum space to equal 
1, and using the equality of all 1D integrals over each Cartesian component $p_j$ of the momentum (j = 
1, 2, 3), which may be reduced to the well-known dimensionless Gaussian integral: 4 

$$C = \left[\int_{-\infty}^{+\infty} \exp\left\{-\frac{p_j^2}{2mT}\right\} dp_j\right]^{-3} = \left[(2mT)^{1/2} \int_{-\infty}^{+\infty} e^{-\xi^2} d\xi\right]^{-3} = (2\pi mT)^{-3/2}. \tag{3.6}$$


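As a sanity check, let us use the Maxwell distribution to calculate the average energy 
corresponding to each half-degree of freedom: 

$$\left\langle \frac{p_j^2}{2m} \right\rangle = \int \frac{p_j^2}{2m}\, dW = \left[C^{1/3} \int_{-\infty}^{+\infty} \frac{p_j^2}{2m} \exp\left\{-\frac{p_j^2}{2mT}\right\} dp_j\right] \times \left[C^{1/3} \int_{-\infty}^{+\infty} \exp\left\{-\frac{p_{j'}^2}{2mT}\right\} dp_{j'}\right]^2 = \frac{T}{\pi^{1/2}} \int_{-\infty}^{+\infty} \xi^2 e^{-\xi^2} d\xi. \tag{3.7}$$

The last integral 5 equals $\sqrt{\pi}/2$, so that, finally, 

$$\left\langle \frac{p_j^2}{2m} \right\rangle = \frac{T}{2}. \tag{3.8}$$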
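Both the normalization (6) and the average (8) are easy to confirm by direct numerical integration over one Cartesian momentum component. The sketch below uses a simple midpoint rule and arbitrary illustrative values of m and T (in units where temperature is measured in energy units); the helper names and the integration limits of 12 standard deviations are my choices.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def check_maxwell(m, T):
    """Return (normalization, <p_j^2/2m>) for one Cartesian component."""
    c13 = (2.0 * math.pi * m * T) ** -0.5      # C^(1/3), from Eq. (3.6)
    p_max = 12.0 * math.sqrt(m * T)            # practically infinite limits

    def w(p):
        # 1D probability density of the momentum component p_j
        return c13 * math.exp(-p * p / (2.0 * m * T))

    norm = integrate(w, -p_max, p_max)
    mean_kinetic = integrate(lambda p: p * p / (2.0 * m) * w(p), -p_max, p_max)
    return norm, mean_kinetic

if __name__ == "__main__":
    norm, mean_kinetic = check_maxwell(m=2.0, T=1.5)
    print(norm, mean_kinetic)   # expected: close to 1 and to T/2
```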


This result is (fortunately :-) in agreement with the equipartition theorem (2.48). It also means that the 
r.m.s. velocity of the particles is 

$$\langle v^2 \rangle^{1/2} = \left(\frac{3T}{m}\right)^{1/2}. \tag{3.9}$$


3 This formula was suggested by J. C. Maxwell as early as in 1860, i.e. well before the Boltzmann and Gibbs 
distributions. Note also that the term “Maxwell distribution” is often associated with the distribution of particle’s 
momentum (or velocity) magnitude, 

$$dW = 4\pi C p^2 \exp\left\{-\frac{p^2}{2mT}\right\} dp = 4\pi C m^3 v^2 \exp\left\{-\frac{m v^2}{2T}\right\} dv, \quad \text{with } 0 \le p, v < \infty,$$

which immediately follows from Eq. (5) combined with the expression $d^3p = 4\pi p^2 dp$ due to the spherical 
symmetry of the distribution in the momentum/velocity space. 

4 See, e.g., MA Eq. (6.9b). 

5 See, e.g., MA Eq. (6.9c). 


For a typical gas (say, $N_2$), with $m \approx 28\, m_p \approx 4.7 \times 10^{-26}$ kg, at room temperature ($T = k_B T_K \approx 
k_B \times 300$ K $\approx 4.1 \times 10^{-21}$ J), this velocity is about 500 m/s - comparable with the sound velocity in the same 
gas (and the muzzle velocity of typical handgun bullets). Still, it is measurable using simple table-top 
equipment (say, a set of two concentric, rapidly rotating cylinders with a thin slit collimating an atomic 
beam emitted at the axis) that was available already at the end of the 19th century. Experiments using 
such equipment gave convincing confirmations of Maxwell’s theory. 
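The numerical estimate above follows directly from Eq. (9) with the temperature converted from kelvins to energy units. A minimal sketch, using the standard SI values of the Boltzmann constant and the proton mass (the function name is illustrative, and the molecular mass is approximated as 28 proton masses, as in the text):

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
M_PROTON = 1.67262192e-27 # proton mass, kg

def rms_velocity(mass_kg, temp_kelvin):
    """R.m.s. velocity of Eq. (3.9), with T = k_B * T_K in joules."""
    return math.sqrt(3.0 * K_B * temp_kelvin / mass_kg)

if __name__ == "__main__":
    v = rms_velocity(28.0 * M_PROTON, 300.0)   # N2 molecule at room temperature
    print(f"{v:.0f} m/s")                      # of the order of 5e2 m/s
```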


This is all very simple (isn’t it?), but actually the thermodynamic properties of a classical gas, 
especially its entropy, are more intricate. To show that, let us apply the Gibbs distribution to gas 
portions consisting of N particles each, rather than just one of them. If the particles are exactly similar, 
the eigenenergy spectrum $\{\varepsilon_k\}$ of each of them is also exactly the same, and each value $E_m$ of the total 
energy is just the sum of particular energies $\varepsilon_{k(l)}$ of the particles, where k(l), with l = 1, 2, ... N, is the 
number of the energy level of the l-th particle. Moreover, since the gas is classical, $\langle N_k \rangle \ll 1$, the probability 
of having two or more particles in any state may be ignored. As a result, we can use Eq. (2.59) to write 

$$Z \equiv \sum_m \exp\left\{-\frac{E_m}{T}\right\} = \sum_{k(l)} \exp\left\{-\frac{1}{T} \sum_l \varepsilon_{k(l)}\right\} = \sum_{k(1)} \sum_{k(2)} \cdots \sum_{k(N)} \prod_l \exp\left\{-\frac{\varepsilon_{k(l)}}{T}\right\}, \tag{3.10}$$

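where the summation has to be carried over all possible states of each particle. Since the summation 
over each set {k(l)} concerns only one of the operands of the product of exponents under the sum, it is 
tempting to complete the calculation as follows: 

$$Z \to Z_{\text{dist}} = \sum_{k(1)} \exp\left\{-\frac{\varepsilon_{k(1)}}{T}\right\} \cdot \sum_{k(2)} \exp\left\{-\frac{\varepsilon_{k(2)}}{T}\right\} \cdots \sum_{k(N)} \exp\left\{-\frac{\varepsilon_{k(N)}}{T}\right\} = \left[\sum_k \exp\left\{-\frac{\varepsilon_k}{T}\right\}\right]^N, \tag{3.11}$$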

where the final summation is over all states of one particle. This formula is indeed valid for 
distinguishable particles. 6 However, if particles are indistinguishable (again, meaning that they are 
identical and free to move within the same spatial region), Eq. (11) has to be modified by what is called 
the correct Boltzmann counting : 

$$Z = \frac{1}{N!} \left[\sum_k \exp\left\{-\frac{\varepsilon_k}{T}\right\}\right]^N, \tag{3.12}$$

that considers all quantum states, differing only by particle permutations in the gas portion, as one. 

Now let us take into account that the fundamental relation (4) implies the following rule for the 
replacement of a sum over quantum states with an integral in the classical limit - whose exact conditions 
are still to be specified: 7 

$$\sum_k (\ldots) \to \int (\ldots)\, dN_{\text{states}} = \frac{gV}{(2\pi)^3} \int (\ldots)\, d^3k = \frac{gV}{(2\pi\hbar)^3} \int (\ldots)\, d^3p. \tag{3.13}$$
In application to Eq. (12), this rule yields 

$$Z = \frac{1}{N!} \left(\frac{gV}{(2\pi\hbar)^3} \left[\int_{-\infty}^{+\infty} \exp\left\{-\frac{p_j^2}{2mT}\right\} dp_j\right]^3\right)^N. \tag{3.14}$$

The integral in square brackets is the same one as in Eq. (6), i.e. equal to $(2\pi mT)^{1/2}$, so that finally 

$$Z = \frac{1}{N!} \left(gV \frac{(2\pi mT)^{3/2}}{(2\pi\hbar)^3}\right)^N = \frac{1}{N!} \left[gV \left(\frac{mT}{2\pi\hbar^2}\right)^{3/2}\right]^N. \tag{3.15}$$

6 Since each particle belongs to the same portion of gas, i.e. cannot be distinguished from others by its spatial 
position, this requires some internal “pencil mark”, for example a specific structure or a specific quantum state of 
its internal degrees of freedom. 

7 As a reminder, we have already used this rule (twice) in Sec. 2.6, with particular values of g. 


Now, assuming that $N \gg 1$, 8 and applying the Stirling formula, we can calculate gas’ free energy, 

$$F = T \ln \frac{1}{Z} = -NT \ln \frac{V}{N} + N f(T), \tag{3.16a}$$

with 

$$f(T) \equiv -T \left\{\ln\left[g \left(\frac{mT}{2\pi\hbar^2}\right)^{3/2}\right] + 1\right\}. \tag{3.16b}$$


The first of these relations is exactly Eq. (1.45) which was derived, in Sec. 1.4, from the equation 
of state PV = NT, using thermodynamic identities. At that stage this equation of state was just 
postulated, but now we can finally derive it by calculating pressure from the second of Eqs. (1.35), and 
Eq. (16a): 


$$P = -\left(\frac{\partial F}{\partial V}\right)_T = \frac{NT}{V}. \tag{3.17}$$


So, the equation of state of the ideal classical gas, with density n = N/V, is indeed given by Eq. (1.44): 


$$P = \frac{NT}{V} = nT. \tag{3.18}$$


Hence we may use Eqs. (1.46)-(1.51), derived from this equation of state, to calculate all other thermodynamic variables of the gas. As one more sanity check, let us start with energy. Using Eq. (1.47) with f(T) given by Eq. (16b), we immediately get

$$E = N\left(f - T\frac{df}{dT}\right) = \frac{3}{2}NT, \tag{3.19}$$


in full agreement with Eq. (8) and hence with the equipartition theorem. Much less trivial is the result 
for entropy, which may be obtained by combining Eqs. (1.46) and (15): 


$$S = -\left(\frac{\partial F}{\partial T}\right)_V = N\left[\ln\frac{V}{N} - \frac{df(T)}{dT}\right]. \tag{3.20}$$


8 For the opposite limit when N = g = 1, Eq. (15) yields the results obtained, by two alternative methods, in 
Problems 2.5 and 2.6. For N = 1, the “correct Boltzmann counting” factor N! equals 1, so that the particle
distinguishability effects vanish - naturally. 
This formula, 9 in particular, provides the means to resolve the following gas mixing paradox (sometimes called the “Gibbs paradox”). Consider two volumes, $V_1$ and $V_2$, separated by a partition, each filled with the same gas, with the same density n, at the same temperature T. Now let us remove the partition and let the gases mix; would the total entropy change? According to Eq. (20), it would not, because the ratio V/N = 1/n, and hence the expression in square brackets, is the same in the initial and the final state, so that the entropy is additive (extensive). This makes full sense if the gas particles in both parts of the volume are identical, i.e. the partition’s removal does not change our information about the system. However, let us assume that all particles are distinguishable; then the entropy should clearly increase, because the mixing would certainly decrease our information about the system, i.e. increase its disorder. A quantitative description of this effect may be obtained using Eq. (11). Repeating for $Z_{\rm dist}$ all the calculations made above for Z, we readily get a different formula for entropy:


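$$S_{\rm dist} = N\left[\ln V - \frac{df_{\rm dist}(T)}{dT}\right], \qquad f_{\rm dist}(T) \equiv -T\ln\left[g\left(\frac{mT}{2\pi\hbar^2}\right)^{3/2}\right]. \tag{3.21}$$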


Notice that in contrast to the S given by Eq. (20), entropy $S_{\rm dist}$ is not proportional to N (at fixed temperature T and density N/V). While for distinguishable particles this fact does not present any conceptual problem, for indistinguishable particles it would mean that entropy is not an extensive variable, i.e. would contradict the basic assumptions of thermodynamics. This fact emphasizes again the necessity of the correct Boltzmann counting in the latter case.

Comparing Eqs. (20) and (21), we can calculate the change of entropy due to mixing of 
distinguishable particles: 

$$\Delta S_{\rm dist} = (N_1+N_2)\ln(V_1+V_2) - (N_1\ln V_1 + N_2\ln V_2) = N_1\ln\frac{V_1+V_2}{V_1} + N_2\ln\frac{V_1+V_2}{V_2} > 0. \tag{3.22}$$

Note that for a particular case, $V_1 = V_2 = V/2$, Eq. (22) reduces to the simple result $\Delta S_{\rm dist} = (N_1+N_2)\ln 2$, which may be readily understood from the point of view of information theory. Indeed, allowing each particle of $N = N_1 + N_2$ to spread to a twice larger volume, we lose one bit of information per particle, i.e. $\Delta I = (N_1+N_2)$ bits for the whole system.
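The difference between Eqs. (20) and (21) at the removal of the partition may be checked with a few lines of code. The sketch below is only an illustration: it uses units with m = ħ = 1 and g = 1 (an arbitrary choice that does not affect the entropy differences), and the function names are hypothetical.

```python
import math

def minus_df_dT(T, g=1.0, m=1.0, hbar=1.0):
    """-df/dT for f(T) of Eq. (3.16b): ln[g (m T / 2 pi hbar^2)^(3/2)] + 5/2."""
    return math.log(g * (m * T / (2.0 * math.pi * hbar**2)) ** 1.5) + 2.5

def entropy_indist(N, V, T):
    """Eq. (3.20): S = N [ln(V/N) - df/dT], with the correct Boltzmann counting."""
    return N * (math.log(V / N) + minus_df_dT(T))

def entropy_dist(N, V, T):
    """Eq. (3.21): S_dist = N [ln V - df_dist/dT]; since f_dist = f + T,
    -df_dist/dT is smaller than -df/dT by exactly 1."""
    return N * (math.log(V) + minus_df_dT(T) - 1.0)

def mixing_entropy_changes(N1, N2, V1, V2, T):
    """Entropy change at the partition removal, for both counting rules."""
    d_indist = entropy_indist(N1 + N2, V1 + V2, T) - entropy_indist(N1, V1, T) - entropy_indist(N2, V2, T)
    d_dist = entropy_dist(N1 + N2, V1 + V2, T) - entropy_dist(N1, V1, T) - entropy_dist(N2, V2, T)
    return d_indist, d_dist

# Two equal portions of the same density: no change for identical particles,
# (N1 + N2) ln 2 for distinguishable ones, i.e. one bit per particle.
dS_indist, dS_dist = mixing_entropy_changes(1000, 1000, 1.0e6, 1.0e6, T=1.0)
print(dS_indist, dS_dist, dS_dist / math.log(2))
```

Note that the vanishing of the first result relies on the two initial densities being equal, exactly as assumed in the formulation of the paradox.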

Let me leave it for the reader to show that result (22) is also valid if particles in each sub-volume 
are indistinguishable from each other, but different from those in another sub-volume, i.e. for mixing of 
two different gases. 10 However, it is certainly not applicable to the system where all particles are 
identical, stressing again that the correct Boltzmann counting (12) does indeed affect entropy, even 
though it is not essential for either the Maxwell distribution (5), or the equation of state (18), or average 
energy (19). 

In this context, one may wonder whether the change (22) (called the mixing entropy) is 
experimentally observable. The answer is yes. For example, after free mixing of two different gases one 
can use a thin movable membrane that is semipermeable, i.e. penetrable by particles of one type only, to 


9 The result presented by Eq. (20), with function f given by Eq. (16b), was obtained independently by O. Sackur
and H. Tetrode in 1911, i.e. well before the final formulation of quantum mechanics in the late 1920s. 

10 By the way, if an ideal classical gas consists of particles of several different sorts, its full pressure is a sum of 
independent partial pressures exerted by each component - the so-called Dalton law. While this fact was an 
important experimental discovery in the early 1800s, from the point of view of statistical physics this is just a 
straightforward corollary of Eq. (18), because in an ideal gas, the component particles do not interact. 
separate them again, thus reducing the entropy back to the initial value, and measure either the necessary mechanical work $\Delta\mathscr{W} = T\Delta S_{\rm dist}$ or the corresponding heat discharge into the heat bath. Practically, measurements of this type are easier in weak solutions, 11 systems with a small concentration $c \ll 1$ of particles of one sort (solute) within much more abundant particles of another sort (solvent). The mixing entropy also affects thermodynamics of chemical reactions in gases and liquids. 12 It is curious that besides purely thermal measurements, mixing entropy in some conducting solutions (electrolytes) is also measurable by a purely electrical method, called cyclic voltammetry, in which a low-frequency ac voltage, applied between solid-state electrodes embedded in the solution, is used to periodically separate different ions, and then mix them again. 13

Now let us briefly discuss two generalizations of our results for ideal classical gases. First, let us 
consider the ideal classical gas in an external field of potential forces. It may be described by replacing 
Eq. (3) with 

$$\varepsilon_k = \frac{p_k^2}{2m} + U(\mathbf{r}_k), \tag{3.23}$$

where $\mathbf{r}_k$ is the position of the kth particle, and $U(\mathbf{r})$ is the potential energy per particle. In this case, Eq. (4) is applicable only to small volumes, $V \to dV = d^3r$, whose linear size is much smaller than the spatial scale of variations of macroscopic parameters of the gas - say, pressure. Hence, instead of Eq. (5), we may only write the probability dW of finding the particle in a small volume $d^3r\,d^3p$ of the 6-dimensional phase space:

$$dW = w(\mathbf{r},\mathbf{p})\,d^3r\,d^3p, \qquad w(\mathbf{r},\mathbf{p}) = {\rm const}\times\exp\left\{-\frac{p^2}{2mT} - \frac{U(\mathbf{r})}{T}\right\}. \tag{3.24}$$

Hence, the Maxwell distribution of particle velocities is still valid at each point r, and a more interesting issue here is the spatial distribution of the total density,

$$n(\mathbf{r}) \equiv N\int w(\mathbf{r},\mathbf{p})\,d^3p, \tag{3.25}$$

of all gas particles, regardless of their momentum. For this variable, Eq. (24) yields 14

$$n(\mathbf{r}) = n(0)\exp\left\{-\frac{U(\mathbf{r})}{T}\right\}, \tag{3.26}$$


11 It is interesting that statistical mechanics of weak solutions is very similar to that of ideal gases, with Eq. (18) 
recast into the following formula (derived in 1885 by J. van’t Hoff), PV = cNT, for the partial pressure of the 
solute. One of its corollaries is that the net force (called the osmotic pressure) exerted on a semipermeable 
membrane is proportional to the difference of solute concentrations it is supporting. 

12 Unfortunately, I do not have time for even a brief introduction into this important field, and have to refer the 
interested reader to specialized textbooks - for example, P. A. Rock, Chemical Thermodynamics, University 
Science Books, 1983; or P. Atkins, Physical Chemistry, 5th ed., Freeman, 1994; or G. M. Barrow, Physical
Chemistry, 6th ed., McGraw-Hill, 1996.

13 See, e.g., either Chapter 6 in A. Bard and L. Faulkner, Electrochemical Methods, 2nd ed., Wiley, 2000 (which is a
good introduction to electrochemistry as a whole); or Sec. II.8.3.1 in F. Scholz (ed.), Electroanalytical Methods,
2nd ed., Springer, 2010.

14 In some textbooks, Eq. (26) is also called the Boltzmann distribution, though it certainly should be
distinguished from the more general Eq. (2.111).
where the potential energy reference is at the origin. As we will see in Chapter 6, in a non-uniform gas the equation of state (18) is valid locally if the particles’ mean free path l is much smaller than the spatial scale of changes of function n(r). 15 In this case, the local gas pressure may be still calculated from Eq. (18):

$$P(\mathbf{r}) = n(\mathbf{r})T = P(0)\exp\left\{-\frac{U(\mathbf{r})}{T}\right\}. \tag{3.27}$$

An important example of application of Eq. (27) is an approximate description of the Earth atmosphere. At all heights $h \ll R_E \sim 6\times10^6$ m above the Earth’s surface (say, the sea level), we may describe the Earth gravity effect by potential U = mgh (where g is the free-fall acceleration), and Eq. (27) yields the so-called barometric formula

$$P(h) = P(0)\exp\left\{-\frac{h}{h_0}\right\}, \quad\text{with } h_0 \equiv \frac{T}{mg} = \frac{k_BT_K}{mg}. \tag{3.28}$$

For N2 (the main component of the atmosphere) at $T_K$ = 300 K, $h_0 \sim 7$ km. This gives the right order of magnitude of the Earth atmosphere’s thickness, though the exact law of pressure change differs somewhat from Eq. (28) because of a certain drop of the absolute temperature T with height, by about 20% at $h \sim h_0$. 16

The second generalization I would like to mention is to particles with internal degrees of freedom. Ignoring, for simplicity, the potential energy U(r), we may describe them by replacing Eq. (3) with

$$\varepsilon_k = \frac{p_k^2}{2m} + \varepsilon_k', \tag{3.29}$$

where $\varepsilon_k'$ describes the internal energy of the kth particle. If the particles are similar, we may repeat all the above calculations, and see that all the results (including the Maxwell distribution) are still valid, with the only exception of Eq. (16b), which now becomes

$$f(T) = -T\left\{\ln\left[g\left(\frac{mT}{2\pi\hbar^2}\right)^{3/2}\sum_{\varepsilon'}\exp\left\{-\frac{\varepsilon'}{T}\right\}\right]+1\right\}, \tag{3.30}$$

where the sum is over the internal states of one particle. As we already know from Eq. (1.51), this change may affect both heat capacities of the gas, $C_V$ and $C_P$, but not their difference (equal to N).


3.2. Calculating μ

Now let us return to Eq. (3), i.e. neglect the external field effects, as well as thermal activation of 
the internal degrees of freedom, and discuss properties of ideal gases of indistinguishable quantum 


15 The mean free path may be defined by the geometric relation $n\sigma l = 1$, where σ is the full cross-section of the
particle-particle scattering - see, e.g., CM 3.7.

16 The reason for the drop is that the atmosphere, including molecules such as H2O, CO2, etc., absorbs Sun’s
radiation at wavelengths ~500 nm much smaller than those of the back-radiation of the Earth surface, with the
spectrum centered at wavelength ~10 μm - see Eq. (2.87) and its discussion.
particles in more detail, paying special attention to the chemical potential μ - which, as you may recall, was a somewhat mysterious aspect of the Fermi and Bose distributions.

Let us start from the classical gas, and recall the conclusion of thermodynamics that μ is the Gibbs potential per particle - see Eq. (1.56). Hence we can calculate μ = G/N from Eqs. (1.49) and (16b). The result,

$$\mu = -T\ln\frac{V}{N} + f(T) + T = T\ln\left[\frac{N}{gV}\left(\frac{2\pi\hbar^2}{mT}\right)^{3/2}\right], \tag{3.31}$$


which may be rewritten as

$$\exp\left\{\frac{\mu}{T}\right\} = \frac{N}{gV}\left(\frac{2\pi\hbar^2}{mT}\right)^{3/2}, \tag{3.32}$$


is very important, because it gives us some information about μ not only for a classical gas, but for quantum (Fermi and Bose) gases as well. Indeed, we already know that for indistinguishable particles the Boltzmann distribution (2.111) is valid only if $\langle N_k\rangle \ll 1$. Comparing this condition with quantum statistics (2.115) and (2.118), we see that the condition of the gas’ classicity may be expressed as

$$\exp\left\{\frac{\mu-\varepsilon_k}{T}\right\} \ll 1 \tag{3.33}$$


for all $\varepsilon_k$. Since the lowest value of $\varepsilon_k$ given by Eq. (3) is zero, Eq. (33) for a gas may be satisfied only if $\exp\{\mu/T\} \ll 1$. This means that the chemical potential of the classical gas has to be not just negative, but also “strongly negative” in the sense

$$-\mu \gg T. \tag{3.34}$$

According to Eq. (32), this condition may be presented as

$$T \gg T_0, \tag{3.35}$$

with $T_0$ defined as

$$T_0 \equiv \frac{\hbar^2}{m}\left(\frac{n}{g}\right)^{2/3} = \frac{\hbar^2}{g^{2/3}mr_A^2}, \quad\text{with } r_A \equiv \frac{1}{n^{1/3}} = \left(\frac{V}{N}\right)^{1/3}. \tag{3.36}$$


Condition (35) is very transparent physically: disregarding factor g (which is typically not much larger than 1), it means that the average thermal energy of a particle (which is of the order of T) has to be much larger than the energy of quantization of the particle’s motion at length $r_A$ - the average distance between the particles. An alternative form of this condition is 17

$$r_A \gg g^{-1/3}r_c, \quad\text{with } r_c \equiv \frac{\hbar}{(mT)^{1/2}}. \tag{3.37}$$


17 In quantum mechanics, parameter $r_c$ so defined is frequently called the correlation length - see, e.g., QM Sec. 7.2 and in particular Eq. (7.37).

For a typical gas (say, N2, with $m \approx 28\,m_p \approx 4.7\times10^{-26}$ kg) at the standard room temperature ($T = k_B\times300$ K $\approx 4.1\times10^{-21}$ J), $r_c \sim 10^{-11}$ m, i.e. is significantly smaller than the physical size $a \sim 3\times10^{-10}$ m of the molecule. This estimate shows that at room temperature, as soon as any practical gas is rare enough to be ideal ($r_A \gg a$), it is classical, i.e. the only way to observe the quantum effects in the translational motion of molecules is a very deep refrigeration. According to Eq. (36), for the same nitrogen molecule, taking $r_A \sim 10^3a \sim 10^{-7}$ m (to ensure that direct interaction effects are negligible), $T_0$ should be well below 1 μK.
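These order-of-magnitude estimates are easy to reproduce. The sketch below evaluates $r_c$ of Eq. (37) and $T_0$ of Eq. (36) in SI units; it assumes a molecular mass of 28 atomic mass units for N2, g = 1, and an inter-particle distance of 1000 molecular sizes, and all names in it are illustrative.

```python
import math

HBAR = 1.054_571_817e-34   # J*s, reduced Planck constant
K_B = 1.380_649e-23        # J/K, Boltzmann constant
M_U = 1.660_539_07e-27     # kg, atomic mass unit

def correlation_length(m, T_kelvin):
    """r_c = hbar / (m T)^(1/2) of Eq. (3.37), with T = k_B * T_kelvin."""
    return HBAR / math.sqrt(m * K_B * T_kelvin)

def quantum_temperature_kelvin(m, r_A, g=1):
    """T_0 = hbar^2 / (g^(2/3) m r_A^2) of Eq. (3.36), converted to kelvins."""
    return HBAR**2 / (g ** (2.0 / 3.0) * m * r_A**2) / K_B

m_N2 = 28.0 * M_U           # assumed mass of the N2 molecule
a = 3.0e-10                 # m, molecular size quoted in the text
r_c = correlation_length(m_N2, 300.0)
T0 = quantum_temperature_kelvin(m_N2, r_A=1.0e3 * a)

print(f"r_c at 300 K: {r_c:.2e} m (molecular size a = {a:.0e} m)")
print(f"T_0 at r_A = 1000 a: {T0:.2e} K")
```

With these assumptions, $r_c$ comes out slightly below $10^{-11}$ m, and $T_0$ is a fraction of a microkelvin.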

In order to analyze quantitatively what happens with gases when T is reduced to such low values, we need to calculate μ for an arbitrary ideal gas of indistinguishable particles. Let us use the lucky fact that the Fermi-Dirac and the Bose-Einstein statistics may be represented with one formula:

$$\langle N(\varepsilon)\rangle = \frac{1}{e^{(\varepsilon-\mu)/T}\pm1}, \tag{3.38}$$

where (and everywhere in the balance of this section) the top sign stands for fermions and the lower one for bosons, to discuss the fermionic and bosonic gases in the same breath.

If we deal with a member of the grand canonical ensemble (Fig. 13), in which μ is externally fixed, we may apply Eq. (38) to calculate the average number N of particles in volume V. If the volume is so large that $N \gg 1$, we may use the general state counting rule (13):

$$N = \frac{gV}{(2\pi\hbar)^3}\int\frac{d^3p}{e^{[\varepsilon(p)-\mu]/T}\pm1} = \frac{gV}{(2\pi\hbar)^3}\int_0^\infty\frac{4\pi p^2dp}{e^{[\varepsilon(p)-\mu]/T}\pm1}. \tag{3.39}$$


In most practical cases, however, the number N of gas particles is fixed by particle confinement (i.e. the gas portion under study is a member of the canonical ensemble - see Fig. 2.6), and hence μ rather than N should be calculated. Here comes the main trick: if N is very large, the relative fluctuation of the particle number is negligibly small ($\sim 1/\sqrt{N} \ll 1$), and the relation between the average values of N and μ should not depend on which of these variables is exactly fixed. Hence, Eq. (39), with μ having the sense of the average chemical potential, should be valid even if N is exactly fixed, so that small fluctuations of N are replaced with (equally small) fluctuations of μ. Physically (as was already mentioned in Sec. 2.8), in this case the role of the μ-fixing environment for any gas sub-portion is played by the rest of the gas, and Eq. (39) expresses the condition of self-consistency of such mutual particle exchange.

In this situation, Eq. (39) may be used for calculating the average μ as a function of two independent parameters: N (i.e. the gas density n = N/V) and temperature T. For carrying out this calculation, it is convenient to convert the right-hand part of Eq. (39) to an integral over the particle’s energy $\varepsilon(p) = p^2/2m$, so that $p = (2m\varepsilon)^{1/2}$, and $dp = (m/2\varepsilon)^{1/2}d\varepsilon$:

Basic equation for μ:

$$N = \frac{gVm^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\frac{\varepsilon^{1/2}d\varepsilon}{e^{(\varepsilon-\mu)/T}\pm1}. \tag{3.40}$$


This key result may be presented in two more convenient forms. First, Eq. (40), derived for our current (3D, isotropic and parabolic-dispersion) approximation (3), is just a particular case of a general relation

$$N = \int_0^\infty g(\varepsilon)\langle N(\varepsilon)\rangle d\varepsilon, \tag{3.41}$$

where

$$g(\varepsilon) \equiv \frac{dN_{\rm states}}{d\varepsilon} \tag{3.42}$$

is the temperature-independent density of all quantum states of a particle - regardless of whether they are occupied or not. Indeed, according to the general Eq. (4), for our simple model (3),

$$g(\varepsilon) = g_3(\varepsilon) \equiv \frac{dN_{\rm states}}{d\varepsilon} = \frac{d}{d\varepsilon}\left(\frac{gV}{(2\pi\hbar)^3}\frac{4\pi}{3}p^3\right) = \frac{4\pi gV}{3(2\pi\hbar)^3}\frac{d(p^3)}{d\varepsilon} = \frac{gVm^{3/2}}{\sqrt{2}\pi^2\hbar^3}\varepsilon^{1/2}, \tag{3.43}$$

so that we return to Eq. (40). On the other hand, for some calculations, it is convenient to introduce a dimensionless energy variable $\xi \equiv \varepsilon/T$ to express Eq. (40) via a dimensionless integral:

$$N = \frac{gV(mT)^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\frac{\xi^{1/2}d\xi}{e^{\xi-\mu/T}\pm1}. \tag{3.44}$$


As a sanity check, in the classical limit (34), the exponent in the denominator of the fraction under the integral is much larger than 1, and Eq. (44) reduces to

$$N \approx \frac{gV(mT)^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\frac{\xi^{1/2}d\xi}{e^{\xi-\mu/T}} = \frac{gV(mT)^{3/2}}{\sqrt{2}\pi^2\hbar^3}\exp\left\{\frac{\mu}{T}\right\}\int_0^\infty\xi^{1/2}e^{-\xi}d\xi, \quad\text{at } -\mu\gg T. \tag{3.45}$$

By the definition of the gamma-function Γ(ξ), 18 this dimensionless integral is just $\Gamma(3/2) = \sqrt{\pi}/2$, and we get

$$\exp\left\{\frac{\mu}{T}\right\} = N\frac{\sqrt{2}\pi^2\hbar^3}{gV(mT)^{3/2}}\frac{2}{\sqrt{\pi}} = \frac{N}{gV}\left(\frac{2\pi\hbar^2}{mT}\right)^{3/2}, \tag{3.46}$$

which is exactly the same result as given by Eq. (32), which has been obtained in a rather different way - from the Boltzmann distribution and thermodynamic identities.
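The reduction of the integral in Eq. (44) to its classical form may also be checked numerically. The following sketch (with a hypothetical function name, and a simple midpoint rule chosen just for transparency) evaluates the integral for both statistics at μ/T = -10 and compares it with exp{μ/T}Γ(3/2).

```python
import math

def occupancy_integral(mu_over_T, sign, x_max=8.0, n=40_000):
    """Midpoint-rule estimate of the integral in Eq. (3.44),
    int_0^inf xi^(1/2) d(xi) / (exp(xi - mu/T) + sign),
    with sign = +1 for fermions and -1 for bosons (the latter need mu/T < 0).
    The substitution xi = x^2 removes the square-root singularity of the
    derivative at the lower limit; x_max = 8 is sufficient for mu/T <= 0."""
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 2.0 * x * x / (math.exp(x * x - mu_over_T) + sign)
    return total * h

mu_over_T = -10.0                                     # deep in the classical limit (3.34)
classical = math.exp(mu_over_T) * math.gamma(1.5)     # exp{mu/T} * sqrt(pi)/2
for name, sign in (("fermions", +1), ("bosons", -1)):
    value = occupancy_integral(mu_over_T, sign)
    print(name, value, value / classical - 1.0)
```

The relative deviations have opposite signs for the two statistics and are of the order of exp{μ/T}, i.e. they are the first quantum corrections to the classical result.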

Unfortunately, in the general case of arbitrary μ the integral in Eq. (44) cannot be worked out analytically. 19 The best we can do is to use temperature $T_0$, defined by Eq. (36), to rewrite Eq. (44) as

$$\frac{T}{T_0} = \left[\frac{1}{\sqrt{2}\pi^2}\int_0^\infty\frac{\xi^{1/2}d\xi}{e^{\xi-\mu/T}\pm1}\right]^{-2/3}. \tag{3.47}$$

We may use this relation to calculate the ratio $T/T_0$, and then the ratio $\mu/T_0 \equiv (\mu/T)\times(T/T_0)$, as functions of μ/T numerically, and then plot the results versus each other, thinking of the former ratio as the argument.

Figure 1 shows the resulting plot. It shows that at large temperatures, $T \gg T_0$, the chemical potential is negative and approaches the classical behavior given by Eq. (46) for both fermions and bosons - just as we could expect. For fermions, the reduction of temperature leads to μ changing its sign from negative to positive, and then approaching a constant positive value called the Fermi energy, $\varepsilon_F \approx 7.595\,T_0$, at $T \to 0$. On the contrary, the chemical potential of a gas of bosons stays negative, and turns


18 See, e.g., MA Eq. (6.7a). 

19 For reader’s reference only: for the upper sign, the integral in Eq. (40) is a particular form (for s = 1/2) of a
special function called the complete Fermi-Dirac integral $F_s$, while for the lower sign, it is a particular case (for s
= 3/2) of another special function called the polylogarithm ${\rm Li}_s$. (In what follows, I will not use these notations.)
into zero at a certain critical temperature $T_c \approx 3.313\,T_0$. Both these limits, which are very important for applications, may (and will be :-) explored analytically, but separately for each statistics.



Fig. 3.1. Chemical potential of an ideal gas of $N \gg 1$ indistinguishable quantum particles, as a function of temperature (at fixed gas density n = N/V, which fixes parameter $T_0 \propto n^{2/3}$), for two different quantum statistics; the horizontal axis is $T/T_0$. The dashed line shows the classical approximation (46) valid at $T \gg T_0$.
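For the reader who would like to reproduce this plot, here is a minimal sketch of the numerical procedure: for a given μ/T, it evaluates the right-hand side of Eq. (47) by a midpoint rule and returns $T/T_0$; the product (μ/T)×(T/T_0) then gives $\mu/T_0$. The function names, the integration method and its parameters are my assumptions made for this illustration, not necessarily the method used to produce the figure.

```python
import math

def integral_44(mu_over_T, sign, n=20_000):
    """Midpoint estimate of int_0^inf xi^(1/2) d(xi) / (exp(xi - mu/T) + sign),
    with sign = +1 for fermions and -1 for bosons (bosons require mu/T <= 0).
    The substitution xi = x^2 makes the integrand smooth at xi = 0."""
    x_max = math.sqrt(max(mu_over_T, 0.0) + 60.0)   # the tail beyond is ~exp(-60)
    h = x_max / n
    total = 0.0
    for i in range(n):
        xi = ((i + 0.5) * h) ** 2
        if sign < 0:
            denom = math.expm1(xi - mu_over_T)       # exp(...) - 1, accurate near zero
        else:
            denom = math.exp(xi - mu_over_T) + 1.0
        total += 2.0 * xi / denom                    # xi^(1/2) d(xi) = 2 x^2 dx
    return total * h

def T_over_T0(mu_over_T, sign):
    """Right-hand side of Eq. (3.47)."""
    return (integral_44(mu_over_T, sign) / (math.sqrt(2.0) * math.pi**2)) ** (-2.0 / 3.0)

# Bosons: mu = 0 gives the critical temperature in units of T_0.
print("T_c / T_0 =", T_over_T0(0.0, -1))

# Fermions: at mu/T >> 1 the product (mu/T)(T/T_0) tends to eps_F / T_0.
for ratio in (5.0, 20.0, 50.0):
    print("mu/T =", ratio, " mu/T_0 =", ratio * T_over_T0(ratio, +1))
```

Scanning μ/T over a range of negative and (for fermions) positive values and plotting $\mu/T_0$ versus $T/T_0$ should reproduce both curves of the figure.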


Before doing that (in the next two sections), let me show that, rather surprisingly, for any (but nonrelativistic!) quantum gas, the product PV expressed in terms of energy,

$$PV = \frac{2}{3}E, \tag{3.48}$$


is the same as follows from Eqs. (18) and (19) for the classical gas, and hence does not depend on the particles’ statistics. In order to prove this, it is sufficient to use Eqs. (2.114) and (2.117) for the grand thermodynamic potential of each quantum state, which may be conveniently represented by a single formula,

$$\Omega_k = \mp T\ln\left(1\pm e^{(\mu-\varepsilon_k)/T}\right), \tag{3.49}$$


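and sum them over all states k, using the general summation formula (13). The result for the total grand potential of a 3D gas with the dispersion law (3) is

$$\Omega = \mp T\frac{gV}{(2\pi\hbar)^3}\int_0^\infty\ln\left(1\pm e^{(\mu-p^2/2m)/T}\right)4\pi p^2dp = \mp T\frac{gVm^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\ln\left(1\pm e^{(\mu-\varepsilon)/T}\right)\varepsilon^{1/2}d\varepsilon. \tag{3.50}$$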


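Working out this integral by parts, exactly as we did it with the one in Eq. (2.90), we get

$$\Omega = -\frac{2}{3}\frac{gVm^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\frac{\varepsilon^{3/2}d\varepsilon}{e^{(\varepsilon-\mu)/T}\pm1} = -\frac{2}{3}\int_0^\infty\varepsilon g_3(\varepsilon)\langle N(\varepsilon)\rangle d\varepsilon. \tag{3.51}$$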


But the last integral is just the total energy E of the gas:

$$E = \frac{gV}{(2\pi\hbar)^3}\int_0^\infty\frac{p^2}{2m}\frac{4\pi p^2dp}{e^{[\varepsilon(p)-\mu]/T}\pm1} = \frac{gVm^{3/2}}{\sqrt{2}\pi^2\hbar^3}\int_0^\infty\frac{\varepsilon^{3/2}d\varepsilon}{e^{(\varepsilon-\mu)/T}\pm1} = \int_0^\infty\varepsilon g_3(\varepsilon)\langle N(\varepsilon)\rangle d\varepsilon, \tag{3.52}$$

so that for any temperature and any particle type, Ω = -(2/3)E. But since, from thermodynamics, Ω = -PV, we have Eq. (48) proved. This universal relation will be repeatedly used below.
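The relation Ω = -(2/3)E is simple enough to be tested numerically for both statistics. The sketch below computes the dimensionless versions of the integrals in Eqs. (50) and (52), with the common pre-factor dropped, and prints their ratio; the function name and the midpoint integration are hypothetical choices for this illustration.

```python
import math

def omega_and_energy(mu_over_T, sign, n=20_000):
    """Dimensionless integrals of Eqs. (3.50) and (3.52), with xi = eps/T:
      omega  = -sign * int_0^inf ln(1 + sign * exp(mu/T - xi)) xi^(1/2) d(xi),
      energy =         int_0^inf xi^(3/2) d(xi) / (exp(xi - mu/T) + sign),
    both in units of T * g V (m T)^(3/2) / (sqrt(2) pi^2 hbar^3).
    sign = +1 for fermions, -1 for bosons (the latter need mu/T < 0)."""
    x_max = math.sqrt(max(mu_over_T, 0.0) + 60.0)
    h = x_max / n
    omega = 0.0
    energy = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        xi = x * x
        # With xi = x^2: xi^(1/2) d(xi) = 2 x^2 dx and xi^(3/2) d(xi) = 2 x^4 dx.
        omega += -sign * math.log1p(sign * math.exp(mu_over_T - xi)) * 2.0 * xi
        energy += 2.0 * xi * xi / (math.exp(xi - mu_over_T) + sign)
    return omega * h, energy * h

for label, mu_over_T, sign in (("fermions", 5.0, +1), ("fermions", -3.0, +1), ("bosons", -0.5, -1)):
    omega, energy = omega_and_energy(mu_over_T, sign)
    print(label, mu_over_T, omega / energy)   # expected to be close to -2/3
```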
3.3. Degenerate Fermi gas

The analysis of low-temperature properties of a Fermi gas is very simple in the limit T = 0. Indeed, in this limit, the Fermi-Dirac distribution (2.115) is just a step function:

$$\langle N(\varepsilon)\rangle = \begin{cases}1, & \text{for } \varepsilon<\mu,\\ 0, & \text{for } \mu<\varepsilon,\end{cases} \tag{3.53}$$

- see the bold line in Fig. 2a. Since $\varepsilon = p^2/2m$ is isotropic in the momentum space, this means that at T = 0, in that space the particles fully occupy all possible quantum states within a sphere (frequently called either the Fermi sphere or the Fermi sea) with some radius $p_F$ (Fig. 2b), while all states above the sea surface are empty. Such a degenerate Fermi gas is a striking manifestation of the Pauli principle: though at thermodynamic equilibrium at T = 0 all particles try to lower their energies as much as possible, only g of them may occupy each quantum state within the Fermi sphere. As a result, the sphere’s volume is proportional to the particle number N, or rather to their density n = N/V.


Fig. 3.2. Representation of the Fermi sea: (a) on the energy axis and (b) in the momentum space.


Indeed, radius $p_F$ may be readily related to the number of particles N using Eq. (39), whose integral in this case is just the Fermi sphere volume:

$$N = \frac{gV}{(2\pi\hbar)^3}\frac{4\pi}{3}p_F^3. \tag{3.54}$$

Now we can use Eq. (3) to express via N the chemical potential μ (which in this limit, $T \to 0$, bears the special name of the Fermi energy $\varepsilon_F$) 20:

$$\varepsilon_F \equiv \mu\big|_{T=0} = \frac{p_F^2}{2m} = \frac{\hbar^2}{2m}\left(6\pi^2\frac{N}{gV}\right)^{2/3} \equiv \left(\frac{9\pi^4}{2}\right)^{1/3}T_0 \approx 7.595\,T_0, \tag{3.55a}$$


where $T_0$ is the quantum temperature scale defined by Eq. (36). This formula quantifies the low-temperature trend of function μ(T), clearly visible in Fig. 1, and in particular explains the ratio $\varepsilon_F/T_0$ mentioned in Sec. 2. Note also a useful and simple relation,

$$\varepsilon_F = \frac{3}{2}\frac{N}{g_3(\varepsilon_F)}, \tag{3.55b}$$


which may be obtained immediately from Eqs. (43) and (54). 


20 Note that in the electronic engineering literature, μ is usually called the Fermi level, at any temperature.
The total energy of the gas may be (equally easily) calculated from Eq. (52):

$$E = \frac{gV}{(2\pi\hbar)^3}\int_0^{p_F}\frac{p^2}{2m}4\pi p^2dp = \frac{gV}{(2\pi\hbar)^3}\frac{4\pi p_F^5}{2m\cdot5} = \frac{3}{5}\varepsilon_FN, \tag{3.56}$$


showing that the average energy, $\langle\varepsilon\rangle \equiv E/N$, of a particle inside the Fermi sea is equal to 3/5 = 60% of that ($\varepsilon_F$) of the most energetic occupied states, on the Fermi surface. Since, according to the formulas of Chapter 1, at zero temperature H = G = Nμ, and F = E, the only thermodynamic variable still to be calculated is pressure P. For that, we could use any of the thermodynamic relations P = (H - E)/V or $P = -(\partial F/\partial V)_T$, but it is even easier to use our recent result (48). Together with Eq. (56), it yields

$$P = \frac{2}{3}\frac{E}{V} = \frac{2}{5}\varepsilon_F\frac{N}{V} = \left(\frac{36\pi^4}{125}\right)^{1/3}P_0 \approx 3.039\,P_0, \quad\text{where } P_0 \equiv nT_0 = \frac{\hbar^2n^{5/3}}{mg^{2/3}}. \tag{3.57}$$


From here, it is easy to calculate the bulk modulus (reciprocal compressibility), 21

$$K \equiv -V\left(\frac{\partial P}{\partial V}\right)_T = \frac{2}{3}\varepsilon_F\frac{N}{V}, \tag{3.58}$$


which is simpler to measure experimentally. 

Perhaps the most important example 22 of the degenerate Fermi gas is the conduction electrons in metals - the electrons that belong to outer shells of the isolated atoms but become common in solid metals and can move through the crystal lattice almost freely. Though electrons (which are fermions with spin s = 1/2 and hence the spin degeneracy g = 2s + 1 = 2) are negatively charged, the Coulomb interaction of conduction electrons with each other is substantially compensated by the positively charged ions of the atomic lattice, so that they follow the simple formulas derived above reasonably well. This is especially true for alkali metals (forming Group 1 of the periodic table of elements), whose experimentally measured Fermi surfaces are spherical within 1% - and even within 0.1% for Na. Table 1 lists, in particular, the experimental values of the bulk modulus for such metals, together with the values given by Eq. (58) using $\varepsilon_F$ calculated from Eq. (55) with the experimental density of conduction electrons. Evidently, the agreement is pretty good, taking into account that the simple theory described above completely ignores such factors as the Coulomb and exchange interactions of the electrons. This agreement implies that, surprisingly, the rigidity of solids (or at least metals) is predominantly due to the kinetic energy of conduction electrons, complemented with the Pauli principle, rather than any electrostatic interactions - though, to be fair, these interactions are the crucial factor defining the equilibrium value of n. Numerical calculations using more accurate approximations (e.g., the density functional theory 23), which agree with experiment with a few percent accuracy, confirm this conclusion. 24
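To see how Eqs. (55a) and (58) work with real numbers, here is a short sketch for sodium. The conduction electron density used in it, $n \approx 2.65\times10^{28}$ m$^{-3}$, is an assumed input (it is not given in the text above; I believe it to be the commonly tabulated value), and the function names are illustrative.

```python
import math

HBAR = 1.054_571_817e-34   # J*s, reduced Planck constant
M_E = 9.109_383_7e-31      # kg, electron mass
EV = 1.602_176_634e-19     # J per electron-volt

def fermi_energy(n, g=2, m=M_E):
    """Eq. (3.55a): eps_F = (hbar^2 / 2m) (6 pi^2 n / g)^(2/3), in joules."""
    return HBAR**2 / (2.0 * m) * (6.0 * math.pi**2 * n / g) ** (2.0 / 3.0)

def bulk_modulus(n, g=2, m=M_E):
    """Eq. (3.58): K = (2/3) eps_F n, in pascals."""
    return (2.0 / 3.0) * fermi_energy(n, g, m) * n

n_Na = 2.65e28   # m^-3, ASSUMED conduction-electron density of sodium

print(f"eps_F = {fermi_energy(n_Na) / EV:.2f} eV")
print(f"K     = {bulk_modulus(n_Na) / 1e9:.2f} GPa")
```

Since $K \propto n\varepsilon_F \propto \varepsilon_F^{5/2}$, the values of this model for other alkali metals may be obtained by scaling with their Fermi energies.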


21 See, e.g., CM Eq. (7.39). 

22 Recently, degenerate gases (with $\varepsilon_F \sim 5T$) have been formed of weakly interacting Fermi atoms as well - see, e.g., K. Aikawa et al., Phys. Rev. Lett. 112, 010404 (2014) and references therein.

23 See, e.g., QM Sec. 8.4.

24 Note also a huge difference between the very high bulk modulus of metals ($K \sim 10^{11}$ Pa) and its very low values in usual gases (for them, at ambient conditions, $K \sim 10^5$ Pa). About 4 orders of magnitude of this difference is due to that in the particle density N/V, but the balance is due to the electron gas’ degeneracy. Indeed, in an ideal classical gas, K = P = NT/V, so that factor $(2/3)\varepsilon_F$ in Eq. (58), of the order of a few eV in metals, should be compared with factor $T \sim 25$ meV in the atomic gases at room temperature.


Chapter 3 


Page 13 of 32 





Essential Graduate Physics 


SM: Statistical Mechanics 


Table 3.1. Experimental and theoretical parameters of electrons’ Fermi sea in some alkali metals 25

Metal   ε_F (eV)    K (GPa)     K (GPa)      γ (mcal/mole·K²)   γ (mcal/mole·K²)
        Eq. (55)    Eq. (58)    experiment   Eq. (69)           experiment
Na      3.24        9.23        6.42         0.26               0.35
K       2.12        3.19        2.81         0.40               0.47
Rb      1.85        2.30        1.92         0.46               0.58
Cs      1.59        1.54        1.43         0.53               0.77


Now looking at the values of $\varepsilon_F$ listed in the table, note that room temperatures ($T_K \sim 300$ K) correspond to $T \sim 25$ meV. As a result, virtually all experiments with metals, at least in their solid or liquid form, are performed in the limit $T \ll \varepsilon_F$. According to Eq. (38), at such temperatures the occupancy step described by the Fermi-Dirac distribution has a finite but relatively small width ~T - see the dashed line in Fig. 2a. Calculations in this case are much facilitated by the so-called Sommerfeld expansion formula 26 for integrals like (40) and (52):

$$I(T) \equiv \int_0^\infty\varphi(\varepsilon)\langle N(\varepsilon)\rangle d\varepsilon \approx \int_0^\mu\varphi(\varepsilon)d\varepsilon + \frac{\pi^2}{6}T^2\frac{d\varphi(\mu)}{d\mu}, \quad\text{at } T\ll\mu, \tag{3.59}$$


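where φ(ε) is an arbitrary function that is sufficiently smooth at ε = μ and integrable at ε = 0. In order to prove this formula, let us introduce another function

$$f(\varepsilon) \equiv \int_0^\varepsilon\varphi(\varepsilon')d\varepsilon', \quad\text{so that } \varphi(\varepsilon) = \frac{df(\varepsilon)}{d\varepsilon}, \tag{3.60}$$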


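and work out the integral I(T) by parts:

$$I(T) = \int_0^\infty\frac{df(\varepsilon)}{d\varepsilon}\langle N(\varepsilon)\rangle d\varepsilon = \int_{\varepsilon=0}^{\varepsilon=\infty}\langle N(\varepsilon)\rangle df = \Big[\langle N(\varepsilon)\rangle f(\varepsilon)\Big]_{\varepsilon=0}^{\varepsilon=\infty} - \int_{\varepsilon=0}^{\varepsilon=\infty}f(\varepsilon)\,d\langle N(\varepsilon)\rangle = \int_0^\infty f(\varepsilon)\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon. \tag{3.61}$$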


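As evident from Eq. (38) and/or Fig. 2a, at $T \ll \mu$, function $-\partial\langle N(\varepsilon)\rangle/\partial\varepsilon$ approaches zero for all energies, besides a narrow peak, of unit area, at ε ≈ μ. Hence, if we expand function f(ε) in the Taylor series near this point, just a few leading terms of the expansion should give us a good approximation:

$$I(T) \approx \int_0^\infty\left[f(\mu) + \frac{df}{d\varepsilon}\Big|_{\varepsilon=\mu}(\varepsilon-\mu) + \frac{1}{2}\frac{d^2f}{d\varepsilon^2}\Big|_{\varepsilon=\mu}(\varepsilon-\mu)^2\right]\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon$$

$$= \int_0^\mu\varphi(\varepsilon')d\varepsilon'\int_0^\infty\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon + \varphi(\mu)\int_0^\infty(\varepsilon-\mu)\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon + \frac{1}{2}\frac{d\varphi(\mu)}{d\mu}\int_0^\infty(\varepsilon-\mu)^2\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon. \tag{3.62}$$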




25 Data from N. Ashcroft and N. Mermin, Solid State Physics, W. B. Saunders, 1976.

26 Named after A. Sommerfeld, who was the first (in 1927) to apply the then-emerging quantum mechanics to
degenerate Fermi gases, in particular to electrons in metals, and may be credited for most of the results discussed in
this section.
In the last form of this relation, the first integral over ε equals $\langle N(\varepsilon=0)\rangle - \langle N(\varepsilon=\infty)\rangle = 1$, the second one vanishes (because the function under it is antisymmetric about the point ε = μ), and only the last one needs to be dealt with explicitly, by working it out by parts and then using a table integral: 27

$$\int_0^\infty(\varepsilon-\mu)^2\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon \approx T^2\int_{-\infty}^{+\infty}\xi^2\left[-\frac{d}{d\xi}\left(\frac{1}{e^\xi+1}\right)\right]d\xi = 4T^2\int_0^\infty\frac{\xi\,d\xi}{e^\xi+1} = 4T^2\frac{\pi^2}{12}. \tag{3.63}$$


Being plugged into Eq. (62), this result proves the Sommerfeld formula (59). 
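The accuracy of the Sommerfeld formula (59) may be illustrated numerically. The sketch below takes φ(ε) = ε^{1/2} (the energy dependence of the 3D density of states), μ = 1 and T = 0.05 in arbitrary energy units, and compares a direct numerical integration with the two terms of Eq. (59); the function name and the numerical parameters are arbitrary choices for this illustration.

```python
import math

def fermi_integral(phi, mu, T, n=40_000):
    """Midpoint estimate of I(T) = int_0^inf phi(eps) <N(eps)> d(eps) for the
    Fermi-Dirac distribution, using eps = x^2 to keep the integrand smooth
    at eps = 0 and cutting the integral off at eps = mu + 40 T."""
    x_max = math.sqrt(mu + 40.0 * T)
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        eps = x * x
        total += phi(eps) * 2.0 * x / (math.exp((eps - mu) / T) + 1.0)
    return total * h

mu, T = 1.0, 0.05
phi = math.sqrt                                    # phi(eps) = eps^(1/2)
zero_T = (2.0 / 3.0) * mu**1.5                     # int_0^mu phi(eps) d(eps)
sommerfeld = zero_T + (math.pi**2 / 6.0) * T**2 * 0.5 / math.sqrt(mu)
numeric = fermi_integral(phi, mu, T)

print("numeric integral   :", numeric)
print("Sommerfeld formula :", sommerfeld)
print("T = 0 term only    :", zero_T)
```

The residual difference between the first two numbers is the next term of the expansion, of the order of $(T/\mu)^4$.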

The last preparatory step we need is to take into account a possible small difference (as we will see below, also proportional to $T^2$) between the temperature-dependent chemical potential μ(T) and the Fermi energy defined as $\varepsilon_F \equiv \mu(0)$, in the largest (first) term in the right-hand part of Eq. (62), to write

$$I(T) \approx \int_0^{\varepsilon_F}\varphi(\varepsilon)d\varepsilon + (\mu-\varepsilon_F)\varphi(\mu) + \frac{\pi^2}{6}T^2\frac{d\varphi(\mu)}{d\mu} \equiv I(0) + (\mu-\varepsilon_F)\varphi(\mu) + \frac{\pi^2}{6}T^2\frac{d\varphi(\mu)}{d\mu}. \tag{3.64}$$


Now, applying this formula to Eq. (41) and the last form of Eq. (52), we get the following results (which are valid for any dispersion law ε(p) and even any dimensionality of the gas):

$$N(T) = N(0) + (\mu-\varepsilon_F)g(\mu) + \frac{\pi^2}{6}T^2\frac{dg(\mu)}{d\mu}, \tag{3.65}$$

$$E(T) = E(0) + (\mu-\varepsilon_F)\mu g(\mu) + \frac{\pi^2}{6}T^2\frac{d}{d\mu}\left[\mu g(\mu)\right]. \tag{3.66}$$

However, the number of particles does not change with temperature, N(T) = N(0), so that Eq. (65) gives an equation for finding the temperature-induced change of μ:

$$\mu - \varepsilon_F = -\frac{\pi^2}{6}T^2\frac{1}{g(\mu)}\frac{dg(\mu)}{d\mu}. \tag{3.67}$$


Note that the change is quadratic in T and negative, in agreement with the numerical results shown in Fig. 1a. Plugging this expression (which is only valid when the magnitude of the change is much smaller than $\varepsilon_F$) into Eq. (66), we finally get the finite-temperature correction to energy:

$$E(T) - E(0) = \frac{\pi^2}{6}g(\mu)T^2, \tag{3.68}$$

where within the accuracy of our approximation, μ may be replaced with $\varepsilon_F$. (Due to the universal relation (48), Eq. (68) also gives the temperature correction to pressure.) Now we may use Eq. (68) to calculate the heat capacity of the degenerate Fermi gas:

$$C_V \equiv \left(\frac{\partial E}{\partial T}\right)_V = \frac{\pi^2}{3}g(\varepsilon_F)T. \tag{3.69}$$


According to Eq. (55b), in the particular case of a 3D gas with the isotropic and parabolic 
dispersion law (3), Eq. (69) reduces to 


27 See, e.g., MA Eqs. (6.8c) and (2.12b), with n= 1. 
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n 2 N n 2 T 

r = - , i.e. C v = ——N — «N. (3.70) 

2 s F 2 s F 

This important result deserves a discussion. First, note that within the range of validity of the Sommerfeld approximation ($T \ll \varepsilon_F$), the specific heat of the degenerate gas is much smaller than that of the classical gas, even without internal degrees of freedom, $C_V = (3/2)N$ - see Eq. (19). The reason for such a small heat capacity is that particles deep inside the Fermi sea cannot pick up thermal excitations with available energies of the order of $T \ll \varepsilon_F$, because all states around them are already occupied. The only particles (or rather quantum states) that may be excited with such small energies are those at the very Fermi surface, more exactly within a surface layer of thickness $\Delta\varepsilon \sim T \ll \varepsilon_F$, and Eq. (69) presents a very vivid expression of this fact.

The second important feature of Eqs. (69)-(70) is the linear dependence of the heat capacity on temperature, which decreases with a reduction of $T$ much slower than that of crystal vibrations - see Eq. (2.99) and its discussion. This means that in metals the specific heat at temperatures $T \ll T_D$ is dominated by the conduction electrons. Indeed, experiments confirm not only the linear dependence (70) of the specific heat, 28 but also the values of the proportionality coefficient $\gamma \equiv C_V/T$ for cases when $\varepsilon_F$ can be calculated independently, for example for alkali metals - see the right two columns of Table 1. More typically, Eq. (69) is used for the experimental measurement of the density of states on the Fermi surface, $g(\varepsilon_F)$ - the factor which participates in many theoretical results, in particular in transport properties of degenerate Fermi gases (see Chapter 6 below).
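The low-temperature shift (67) of the chemical potential is easy to check numerically. The sketch below is only an illustration under stated assumptions: it takes the 3D parabolic case, for which $g(\varepsilon) \propto \varepsilon^{1/2}$ so that Eq. (67) becomes $\mu - \varepsilon_F \approx -(\pi^2/12)T^2/\varepsilon_F$, measures all energies in units of $\varepsilon_F$, and solves the particle-conservation condition by bisection. The function names and the integration parameters are hypothetical choices, not taken from the text.

```python
import math


def particle_integral(mu, T, n=4000, tail=40.0):
    """Integral of sqrt(eps) / (exp((eps - mu)/T) + 1) over eps >= 0.

    The substitution eps = x**2 removes the square-root behaviour at the
    origin; the exponentially small tail is cut at eps = mu + tail*T.
    Composite Simpson's rule, so n must be even.
    """
    x_max = math.sqrt(max(mu, 0.0) + tail * T)
    h = x_max / n

    def f(x):
        return 2.0 * x * x / (math.exp((x * x - mu) / T) + 1.0)

    total = f(0.0) + f(x_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0


def chemical_potential(T, eps_f=1.0):
    """Solve N(mu, T) = N(T = 0) for mu by bisection (3D parabolic gas)."""
    target = (2.0 / 3.0) * eps_f ** 1.5  # integral of sqrt(eps) from 0 to eps_f
    lo, hi = 0.5 * eps_f, 1.5 * eps_f    # assumed bracket, adequate for T << eps_f
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if particle_integral(mid, T) > target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def sommerfeld_mu(T, eps_f=1.0):
    """Leading-order estimate following from Eq. (67) for g ~ sqrt(eps)."""
    return eps_f - (math.pi ** 2 / 12.0) * T * T / eps_f


if __name__ == "__main__":
    for T in (0.02, 0.05, 0.1):
        print(T, chemical_potential(T), sommerfeld_mu(T))
```

The two columns should agree better and better as $T/\varepsilon_F$ decreases, with the residual difference scaling as $T^4$, i.e. as the first term dropped in the Sommerfeld expansion.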


3.4. Bose-Einstein condensation 

Now let us explore what happens at the cooling of an ideal gas of bosons. Figure 3a shows, on a more appropriate log-log scale, the same plot as Fig. 1b, i.e. the numerical solution of Eq. (47) with the appropriate (negative) sign in the denominator. One can see that the chemical potential $\mu$ indeed tends to zero at some finite “critical temperature” $T_c$. This temperature may be found by taking $\mu = 0$ in Eq. (47), which is then reduced to a table integral: 29


Critical temperature:

$$T_c = \frac{2\pi\hbar^2}{m}\left[\frac{N}{gV\,\zeta(3/2)}\right]^{2/3} \approx 3.313\,T_0, \tag{3.71}$$

the result explaining the $T_c/T_0$ ratio mentioned in Sec. 2.

Hence we must have a good look at the temperature interval $0 < T < T_c$, which may look rather mysterious. Indeed, within this range, the chemical potential $\mu$ cannot be either negative or zero, because then Eq. (41) would give a value of $N$ smaller than the number of particles we actually have. On the other hand, $\mu$ cannot be positive either, because integral (41) would diverge at $\varepsilon \to \mu$ due to the divergence of $\langle N(\varepsilon)\rangle$ - see, e.g., Fig. 2.15. The only possible resolution of the paradox, suggested by A. Einstein, is as follows: at $T < T_c$, the chemical potential of each particle still equals exactly zero, but a certain number ($N_0$ of $N$) of them are in the ground state (with $\varepsilon \equiv p^2/2m = 0$), forming the so-called Bose-Einstein condensate, very frequently referred to as BEC.

Fig. 3.3. The Bose-Einstein condensation: (a) chemical potential of the gas and (b) its pressure, as functions of temperature. The dashed line corresponds to the classical gas.

28 Solids, with their low thermal expansion coefficients, present a virtually fixed-volume confinement for the electron gas, so that the specific heat measured at ambient conditions may be legitimately compared with the calculated $c_V$.

29 See, e.g., MA Eqs. (6.8b) and (6.6c) with s = 3/2.

Since the condensate particles do not contribute to Eq. (41) (because of the factor $\varepsilon^{1/2} = 0$), their number $N_0$ may be calculated by using Eq. (44), with $\mu = 0$, to find the number $(N - N_0)$ of particles still remaining in the gas, i.e. having energy $\varepsilon > 0$:

$$N - N_0 = \frac{gV(mT)^{3/2}}{\sqrt{2}\,\pi^2\hbar^3}\int_0^\infty \frac{\xi^{1/2}\,d\xi}{e^{\xi} - 1}. \tag{3.72}$$

This result is even simpler than it may look. Indeed, let us write it for the case $T = T_c$, when $N_0 = 0$: 30

$$N = \frac{gV(mT_c)^{3/2}}{\sqrt{2}\,\pi^2\hbar^3}\int_0^\infty \frac{\xi^{1/2}\,d\xi}{e^{\xi} - 1}. \tag{3.73}$$


Since the dimensionless integrals in both equations are similar, we may just divide them, getting an 
extremely simple and elegant result: 


$$\frac{N - N_0}{N} = \left(\frac{T}{T_c}\right)^{3/2}, \quad \text{so that } N_0 = N\left[1 - \left(\frac{T}{T_c}\right)^{3/2}\right], \quad \text{at } T \leq T_c. \tag{3.74a}$$


Please note that this result is only valid for the particles whose motion, within volume $V$, is free - in other words, for the particles trapped in a rigid-wall box of volume $V$. In typical experiments with the Bose-Einstein condensation of diluted gases of neutral (and hence weakly interacting) atoms, particles are trapped at the bottom of a “soft” potential well, which may be well approximated by a 3D quadratic parabola: $U(\mathbf{r}) = m\omega^2 r^2/2$. It is straightforward to show (and hence left for the reader’s exercise) that in this case the temperature dependence of $N_0$ is somewhat different:

$$N_0 = N\left[1 - \left(\frac{T}{T^*}\right)^{3}\right], \quad \text{at } T \leq T^*, \tag{3.74b}$$

where $T^*$ is a critical temperature that depends on $\hbar\omega$, i.e. the confining potential’s “steepness”, rather than on the gas’ volume (which in this case is not fixed). Figure 4 shows one of the first sets of experimental data for the Bose-Einstein condensation of dilute gases of neutral atoms. Taking into account the finite number of particles in the experiment, the agreement with the simple theory is surprisingly good.

30 This is, of course, just another form of Eq. (71).

Fig. 3.4. Total number $N$ of trapped $^{87}$Rb atoms (inset) and their ground-state fraction $N_0/N$, as functions of the ratio $T/T^*$, as measured by J. Ensher et al., Phys. Rev. Lett. 77, 4984 (1996). In this experiment, $T^*$ was as low as $0.28\times10^{-6}$ K. The solid line shows the simple theoretical dependence $N_0(T)$, given by Eq. (74b), while other lines correspond to more complex theories taking into account the finite number $N$ of trapped atoms. © 1996 APS.
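To make the difference between Eqs. (74a) and (74b) tangible, here is a minimal sketch that evaluates both condensate fractions at the same reduced temperature. The function name and the `trap` labels are illustrative choices rather than anything defined in the text; the only physics used is the two power laws quoted above.

```python
def condensate_fraction(t, trap="box"):
    """Ground-state fraction N_0/N of an ideal Bose gas.

    t    -- reduced temperature: T/T_c for the rigid-wall box, Eq. (74a),
            or T/T* for the 3D harmonic trap, Eq. (74b)
    trap -- "box" (exponent 3/2) or "harmonic" (exponent 3)
    """
    exponent = {"box": 1.5, "harmonic": 3.0}[trap]
    if t >= 1.0:
        return 0.0  # no macroscopic condensate above the critical temperature
    return 1.0 - t ** exponent


if __name__ == "__main__":
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, condensate_fraction(t, "box"), condensate_fraction(t, "harmonic"))
```

At any given reduced temperature below the critical point the harmonic trap holds the larger condensed fraction, because the cubic law falls off more slowly as $t \to 0$ is left behind.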


Now returning to the rigid-wall box model, let us explore what happens at the critical temperature and below it with other gas parameters. Equation (52) with the appropriate (lower) sign shows that approaching this point from higher temperatures, the gas energy and hence its pressure do not vanish (Fig. 3b). Indeed, at $T = T_c$ (where $\mu = 0$), that equation yields 31

$$E(T_c) = \frac{gV m^{3/2}T_c^{5/2}}{\sqrt{2}\,\pi^2\hbar^3}\int_0^\infty \frac{\xi^{3/2}\,d\xi}{e^{\xi} - 1} = \frac{3}{2}\frac{\zeta(5/2)}{\zeta(3/2)}\,NT_c \approx 0.7701\,NT_c, \tag{3.75}$$


so that using the universal relation (48), we get a pressure value,

$$P(T_c) = \frac{2}{3}\frac{E(T_c)}{V} = \frac{\zeta(5/2)}{\zeta(3/2)}\frac{N}{V}T_c \approx 0.5134\,\frac{N}{V}T_c \approx 1.701\,P_0, \tag{3.76}$$


which is somewhat lower than, but comparable to $P(0)$ for the fermions - cf. Eq. (57). Now we can use the same Eq. (52), also with $\mu = 0$, to calculate the energy of the gas at $T < T_c$,

$$E(T) = \frac{gV m^{3/2}T^{5/2}}{\sqrt{2}\,\pi^2\hbar^3}\int_0^\infty \frac{\xi^{3/2}\,d\xi}{e^{\xi} - 1}. \tag{3.77}$$

31 For the involved dimensionless integral see, e.g., MA Eqs. (6.8b) and (6.6c) with s = 5/2.


Comparing this relation with the first form of Eq. (75), which features the same integral, we 
immediately get one more simple temperature dependence: 


BEC: energy below $T_c$:

$$E(T) = E(T_c)\left(\frac{T}{T_c}\right)^{5/2}, \quad \text{at } T \leq T_c. \tag{3.78}$$


From the universal relation (48), we immediately see that pressure follows the same dependence:

BEC: pressure below $T_c$:

$$P(T) = P(T_c)\left(\frac{T}{T_c}\right)^{5/2}, \quad \text{at } T \leq T_c. \tag{3.79}$$


This temperature dependence of pressure is shown with the blue line in Fig. 3b. The plot shows that for all temperatures (both below and above $T_c$) the pressure of the bosonic gas is below that of the classical gas of the same density. Note also that since, according to Eqs. (57) and (76), $P(T_c) \propto P_0 \propto V^{-5/3}$, while, according to Eqs. (37) and (71), $T_c \propto T_0 \propto V^{-2/3}$, pressure (79) does not depend on volume at all! The physics of this result (that is valid at $T < T_c$ only) is that as we decrease the volume at a fixed total number of particles, more and more of them go to the condensate, decreasing the number $(N - N_0)$ of particles in the gas phase, but not changing its pressure. Such behavior is very typical for phase transitions - see, in particular, the next chapter.


The last thermodynamic variable of major interest is the heat capacity, because it may be readily 
measured in many systems. For temperatures T < T c , it may be easily calculated from Eq. (78): 


$$C_V(T) \equiv \left(\frac{\partial E}{\partial T}\right)_V = E(T_c)\,\frac{5}{2}\frac{T^{3/2}}{T_c^{5/2}}, \tag{3.80}$$


so that below $T_c$, the capacity increases with temperature, at the critical temperature reaching the value,

$$C_V(T_c) = \frac{5}{2}\frac{E(T_c)}{T_c} \approx 1.925\,N, \tag{3.81}$$

which is approximately 28% above that ($3N/2$) of the classical gas - in both cases ignoring the possible contributions from the internal degrees of freedom. The analysis for $T > T_c$ is a little bit more cumbersome, because differentiating $E$ over temperature - say, using Eq. (52) - one should also take into account the temperature dependence of $\mu$ that follows from Eq. (40) - see also Fig. 1b. However, the most important feature of the result may be predicted without the calculation (which is being left for the reader’s exercise). Since at $T \gg T_c$ the heat capacity has to approach the classical value, it must decrease at $T > T_c$, thus forming a sharp maximum (a “cusp”) at the critical point $T = T_c$ - see Fig. 5.
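The numerical constants quoted in Eqs. (71), (75), (76), and (81) are all simple combinations of $\zeta(3/2)$ and $\zeta(5/2)$, so they can be cross-checked in a few lines. The sketch below is only an illustration: it evaluates the zeta function by direct summation with an Euler-Maclaurin tail estimate (a standard technique, chosen here to avoid any external library), and it assumes that the temperature scale $T_0$ of Eq. (37) is $(\hbar^2/m)(N/gV)^{2/3}$, which is consistent with the factor 3.313 but is not restated in this section. All names are hypothetical.

```python
import math


def zeta(s, m=1000):
    """Riemann zeta function for real s > 1.

    Direct sum of the first m - 1 terms plus the Euler-Maclaurin estimate
    of the remaining tail: integral + half of the first omitted term +
    the first derivative correction.
    """
    head = sum(n ** -s for n in range(1, m))
    tail = m ** (1.0 - s) / (s - 1.0) + 0.5 * m ** -s + s * m ** (-s - 1.0) / 12.0
    return head + tail


Z32 = zeta(1.5)
Z52 = zeta(2.5)

# T_c / T_0, assuming T_0 = (hbar^2/m) (N/gV)^(2/3)  -- cf. Eq. (71)
TC_OVER_T0 = 2.0 * math.pi / Z32 ** (2.0 / 3.0)
# E(T_c) / (N T_c)                                     -- cf. Eq. (75)
E_OVER_NTC = 1.5 * Z52 / Z32
# P(T_c) V / (N T_c), from P = (2/3) E / V             -- cf. Eq. (76)
PV_OVER_NTC = Z52 / Z32
# C_V(T_c) / N = (5/2) E(T_c) / (N T_c)                -- cf. Eq. (81)
CV_OVER_N = 2.5 * E_OVER_NTC

if __name__ == "__main__":
    print("T_c/T_0        =", TC_OVER_T0)
    print("E(T_c)/(N T_c) =", E_OVER_NTC)
    print("P V/(N T_c)    =", PV_OVER_NTC)
    print("C_V(T_c)/N     =", CV_OVER_N)
```

The computed values should agree with the constants quoted in the text to within about $10^{-3}$; the product of the first and the third of them gives the coefficient of $P_0$ in Eq. (76).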

Such a cusp is a good indication of the Bose-Einstein condensation in virtually any experimental system, especially because inter-particle interactions (unaccounted for in our simple discussion) typically make this feature even more substantial, turning it into a weak (logarithmic) singularity. Historically, such a singularity (called the $\lambda$-point because of the characteristic shape of the $C_V(T)$ dependence) was the first noticed, though not immediately understood sign of the Bose-Einstein condensation, observed in 1931 by W. Keesom and K. Clusius in liquid $^4$He at $T = T_c \approx 2.17$ K. Other milestones of the Bose-Einstein condensation studies include:

- the experimental discovery of superconductivity in metals, by H. Kamerlingh-Onnes in 1911;

- the development of the Bose-Einstein statistics implying the condensation, in 1924-1925;

- the discovery of superfluidity in liquid $^4$He by P. Kapitza and (independently) by J. Allen and D. Misener in 1937, and its explanation as a result of the Bose-Einstein condensation by F. and H. London and L. Tisza, with further elaborations by L. Landau (all in 1938);

- the explanation of superconductivity as the result of formation of Cooper pairs of electrons, with an integer total spin, with the simultaneous Bose-Einstein condensation of such effective bosons, by J. Bardeen, L. Cooper, and J. Schrieffer in 1957;

- the discovery of superfluidity of two different phases of $^3$He, due to the similar Bose-Einstein condensation of pairs of its fermion atoms, by D. Lee, D. Osheroff, and R. Richardson in 1972;

- the first observation of the Bose-Einstein condensation in dilute gases ($^{87}$Rb by E. Cornell, C. Wieman et al. and $^{23}$Na by W. Ketterle et al.) in 1995.

Fig. 3.5. Temperature dependences of the heat capacity of an ideal Bose-Einstein gas, calculated from Eqs. (52) and (40) for $T > T_c$, and from Eq. (80) for $T < T_c$.


The importance of the last achievement (and of the continuing intensive work in this direction 32 ) stems from the fact that in contrast to other Bose-Einstein condensates, in dilute gases (with the typical density $n$ as low as ~ $10^{14}$ cm$^{-3}$) the particles interact very weakly, and hence many experimental results are very close to the simple theory described above and its straightforward elaborations - see, e.g., Fig. 4. On the other hand, the importance of prior implementations of the Bose-Einstein condensates, which involve more complex and challenging physics, should not be underestimated - as it sometimes is.


The most important feature of any Bose-Einstein condensate is that all $N_0$ condensed particles are in the same quantum state, and hence are described by exactly the same wavefunction. This wavefunction is substantially less “feeble” than that of a single particle - in the following sense. In the second quantization language, 33 the well-known Heisenberg’s uncertainty relation may be rewritten for the creation/annihilation operators; in particular, for bosons,

$$\left[\hat{a}, \hat{a}^\dagger\right] = 1. \tag{3.82}$$

Since $\hat{a}$ and $\hat{a}^\dagger$ are the quantum-mechanical operators of the complex amplitude $a = A\exp\{i\varphi\}$ and its complex conjugate $a^* = A\exp\{-i\varphi\}$, where $A$ and $\varphi$ are the real amplitude and phase of the wavefunction, Eq. (82) yields the following approximate uncertainty relation (strict in the limit $\delta\varphi \ll 1$) between the number of particles $N = AA^*$ and the phase $\varphi$:

$$\delta N\,\delta\varphi > \frac{1}{2}. \tag{3.83}$$

32 Its detailed discussion may be found, e.g., in: C. Pethick and H. Smith, Bose-Einstein Condensation in Dilute Gases, 2nd ed., Cambridge U. Press, 2008.

33 See, e.g., QM Sec. 8.3.

This means that a condensate of $N \gg 1$ bosons may be in a state with both phase and amplitude of the wavefunction behaving virtually as c-numbers, with negligible relative uncertainties: $\delta N \ll N$, $\delta\varphi \ll 1$. Moreover, such states are much less susceptible to perturbations by experimental instruments. For example, the supercurrent $I_s$ carried along a superconducting wire by a coherent Bose-Einstein condensate of Cooper pairs may be as high as hundreds of amperes. As a result, the “strange” behaviors predicted by quantum mechanics are not averaged out as in the usual particle ensembles (see, e.g., the discussion of the density matrix in Sec. 2.1), but may be directly revealed in macroscopic, measurable behaviors of the condensate.


For example, the density $\mathbf{j}_s$ of the supercurrent may be described by the same formula as the usual probability current density of a single particle, 34 multiplied by the Cooper pair density $n$ and the electric charge $q = -2e$ of a single pair:

$$\mathbf{j}_s = qn\,\frac{\hbar}{m}\left(\nabla\varphi - \frac{q}{\hbar}\mathbf{A}\right), \tag{3.84}$$


where $\mathbf{A}$ is the vector-potential of the (electro)magnetic field. If a superconducting wire is not extremely thin, the current flow does not penetrate its interior, 35 so that $\mathbf{j}_s$ may be taken for zero. As a result, the integral of Eq. (84), taken along a contour $C$ inside a closed wire loop, yields

$$\frac{q}{\hbar}\oint_C \mathbf{A}\cdot d\mathbf{r} = \Delta\varphi = 2\pi m, \tag{3.85}$$

where $m$ is an integer. But, according to electrodynamics, the integral participating in this equation is nothing more than the flux $\Phi$ of the magnetic field $\mathbf{B}$ piercing the wire loop area $A$. Thus we immediately arrive at the famous magnetic flux quantization effect

$$\Phi \equiv \int_A B_n\,d^2r = m\Phi_0, \quad \Phi_0 \equiv \frac{2\pi\hbar}{|q|} = \frac{\pi\hbar}{e} \approx 2.07\times10^{-15}\ \text{Wb}, \tag{3.86}$$

which was theoretically predicted in 1950 and experimentally observed in 1961. Most fantastically, this 
effect holds true even in very large loops, sustained by the Bose-Einstein condensate of Cooper pairs, 
“coherent over miles of dirty lead wire”, citing J. Bardeen’s famous expression. 
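The value of the flux quantum in Eq. (86) follows directly from two fundamental constants. As a quick illustration (not part of the original text), the sketch below evaluates $\Phi_0 = 2\pi\hbar/|q|$ for $|q| = 2e$, using the exact SI values of the Planck constant and the elementary charge; the helper `flux_in_quanta` and the sample loop parameters are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34           # J*s, exact in the 2019 SI
ELEMENTARY_CHARGE = 1.602176634e-19  # C, exact in the 2019 SI
HBAR = PLANCK_H / (2.0 * math.pi)


def flux_quantum(charge=2.0 * ELEMENTARY_CHARGE):
    """Phi_0 = 2*pi*hbar/|q|, in webers; |q| = 2e for Cooper pairs, Eq. (86)."""
    return 2.0 * math.pi * HBAR / abs(charge)


def flux_in_quanta(b_field, area):
    """Flux B*A of a uniform normal field through a flat loop, in units of Phi_0."""
    return b_field * area / flux_quantum()


if __name__ == "__main__":
    print("Phi_0 =", flux_quantum(), "Wb")
    # Illustrative numbers only: a 1 mm^2 loop in a field of 1 microtesla
    print("quanta:", flux_in_quanta(1e-6, 1e-6))
```

Even such a weak field through a small loop corresponds to hundreds of flux quanta, which is why resolving single quanta requires either very small loops or very well screened fields.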

Other prominent examples of such macroscopic quantum effects in Bose-Einstein condensates include not only the superfluidity and superconductivity as such, but also the Josephson effect, quantized Abrikosov vortices, etc. Some of these effects are briefly discussed in other parts of this series. 36

34 See, e.g., QM Eq. (3.28).

35 This is the Meissner-Ochsenfeld (or just “Meissner”) effect which may be also readily explained using Eq. (84), combined with the Maxwell equations - see, e.g., EM Sec. 6.3.


3.5. Gases of weakly interacting particles 

Now let us discuss the weak particle interaction effects on macroscopic properties of their gas. 
(Unfortunately, I will have time to do that only for a brief discussion of these effects in classical gases of 
indistinguishable particles. 37 ) 

In most cases of interest, particle interaction may be described by a certain potential energy $U$, so that the total energy is

$$E = \sum_{k=1}^{N}\frac{p_k^2}{2m} + U(\mathbf{r}_1, \ldots, \mathbf{r}_k, \ldots, \mathbf{r}_N), \tag{3.87}$$


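where $\mathbf{r}_k$ is the position of the $k$-th particle’s center. Let us see how far the statistical physics would allow us to proceed for an arbitrary potential $U$. For $N \gg 1$, at the calculation of the Gibbs statistical sum (2.59), we may perform the usual transfer from the summation over all quantum states of the system to the integration over the $6N$-dimensional space, with the correct Boltzmann counting:

$$Z \equiv \sum_m e^{-E_m/T} \to \frac{1}{N!}\frac{g^N}{(2\pi\hbar)^{3N}}\int \exp\left\{-\sum_{k=1}^{N}\frac{p_k^2}{2mT}\right\}d^3p_1\ldots d^3p_N \int \exp\left\{-\frac{U(\mathbf{r}_1,\ldots,\mathbf{r}_N)}{T}\right\}d^3r_1\ldots d^3r_N$$

$$\equiv \left(\frac{1}{N!}\frac{g^N V^N}{(2\pi\hbar)^{3N}}\int \exp\left\{-\sum_{k=1}^{N}\frac{p_k^2}{2mT}\right\}d^3p_1\ldots d^3p_N\right)\times\left(\frac{1}{V^N}\int \exp\left\{-\frac{U(\mathbf{r}_1,\ldots,\mathbf{r}_N)}{T}\right\}d^3r_1\ldots d^3r_N\right). \tag{3.88}$$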


But according to Eq. (14), the first operand in the last product is the statistical sum of an ideal gas (with the same $g$, $N$, $V$, and $T$), so that we may use Eq. (2.63) to write

$$F = F_{ideal} - T\ln\left[\frac{1}{V^N}\int e^{-U/T}\,d^3r_1\ldots d^3r_N\right] \equiv F_{ideal} - T\ln\left[1 + \frac{1}{V^N}\int\left(e^{-U/T} - 1\right)d^3r_1\ldots d^3r_N\right], \tag{3.89}$$

where $F_{ideal}$ is the free energy of the ideal gas (i.e. the same gas but with $U = 0$), given by Eq. (16).

I believe that Eq. (89) is a very convincing demonstration of the enormous power of the statistical physics. Instead of trying to solve an impossibly complex problem of classical dynamics of $N \gg 1$ (think of $N \sim 10^{23}$) interacting particles, and calculating appropriate ensemble averages later on, the Gibbs approach reduces finding the free energy (and then, from thermodynamic relations, all other thermodynamic variables) to the calculation of just one integral in the right-hand part of Eq. (89). Still, this integral is $3N$-dimensional and may be worked out analytically only if particle interaction is weak in some sense. Indeed, the last form of Eq. (89) makes it especially evident that if $U \to 0$ everywhere, the term in parentheses under the integral vanishes, and so does the integral itself, and hence the addition to $F_{ideal}$.


36 See QM Sec. 2.3, and EM Secs. 6.3 and 6.4. Recently, some of these effects were observed in the Bose-Einstein condensates of diluted gases as well.

37 A concise discussion of weak interactions in quantum gases may be found, for example, in Chapter 10 of K. Huang, Statistical Mechanics, 2nd ed., Wiley, 2003.

Now let us see what this integral would yield for the simplest, short-range interactions, in which potential $U$ is substantial only when the mutual distance $\mathbf{r}_{kk'} \equiv \mathbf{r}_k - \mathbf{r}_{k'}$ between particle centers is smaller than a certain value $2r_0$, where $r_0$ may be interpreted as the particle size scale. If the gas is sufficiently dilute, so that the particle radius $r_0$ is much smaller than the average distance $r_{ave}$ between the particles, the integral in Eq. (89) is of the order of $(2r_0)^{3N}$, i.e. much smaller than $(r_{ave})^{3N} \sim V^N$. Then we may expand the logarithm in Eq. (89) into the Taylor series with respect to the small second term in the square brackets, and keep just the first term of the series:

$$F \approx F_{ideal} - \frac{T}{V^N}\int\left(e^{-U/T} - 1\right)d^3r_1\ldots d^3r_N. \tag{3.90}$$


Even more importantly, if the gas density is so low that the chances for 3 or more particles to come close to each other and interact (collide) are very small, pair collisions are the most important. In this case, we may recast the integral in Eq. (90) as a sum of $N(N-1)/2 \approx N^2/2$ similar terms describing such pair interactions, each of the type

$$V^{N-2}\int\left(e^{-U(\mathbf{r}_{kk'})/T} - 1\right)d^3r_k\,d^3r_{k'}. \tag{3.91}$$

It is convenient to think about $\mathbf{r}_{kk'}$ as the radius-vector of particle $k$ in the reference frame with the origin placed at particle $k'$ - see Fig. 6a.


Fig. 3.6. The definition of interparticle distance vectors at their (a) pair and (b) triple interactions.


Then it is clear that in Eq. (91), we may first calculate the integral over $\mathbf{r}_{k'}$ while keeping the distance vector $\mathbf{r}_{kk'}$, and hence $U(\mathbf{r}_{kk'})$, constant, getting one more factor $V$. Moreover, since all particle pairs are similar, in the remaining integral over $\mathbf{r}_{kk'}$ we may drop the radius-vector index, so that Eq. (90) becomes

$$F = F_{ideal} - \frac{T}{V^N}\frac{N^2}{2}V^{N-1}\int\left(e^{-U(\mathbf{r})/T} - 1\right)d^3r \equiv F_{ideal} + T\frac{N^2}{V}B(T), \tag{3.92}$$

where $B(T)$, called the second virial coefficient, 38 has an especially simple form for spherically-symmetric interactions:

$$B(T) \equiv \frac{1}{2}\int\left(1 - e^{-U(\mathbf{r})/T}\right)d^3r \to \frac{1}{2}\int_0^\infty 4\pi r^2\left(1 - e^{-U(r)/T}\right)dr. \tag{3.93}$$

38 Term “virial”, from Latin viris (meaning “force”), was introduced to molecular physics by R. Clausius. The motivation for adjective “second” for $B(T)$ is evident from the last form of Eq. (94), with the “first virial coefficient”, standing before the $N/V$ ratio and sometimes denoted $A(T)$, equal to 1 - see also Eq. (100) below.

From Eq. (92), and the second of the thermodynamic relations (1.35), we can already tell something important about the equation of state:

$$P = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = P_{ideal} + \frac{N^2T}{V^2}B(T) = T\left[\frac{N}{V} + B(T)\frac{N^2}{V^2}\right]. \tag{3.94}$$

We see that at a fixed gas density $n = N/V$, the pair interaction creates additional pressure, proportional to $(N/V)^2 = n^2$ and a function of temperature, $B(T)T$.

Let us calculate $B(T)$ for a couple of simple models of particle interactions. The solid line in Fig. 7 shows (schematically) a typical form of the interaction potential between electrically neutral molecules with zero spontaneous electric dipole moment.



Fig. 3.7. Pair interaction of particles. 
Solid line: a typical interaction potential; 
dashed line: its hardball model (95); 
dash-dotted line: the improved model 
(97) - schematically. The inset illustrates 
the idea of the hardball model. 


At large distances the interaction of particles that do not have their own permanent electrical dipole moment $\mathbf{p}$, is dominated by the attraction (the so-called London dispersion force) between correlated components of the spontaneously induced dipole moments, giving $U(r) \propto -r^{-6}$ at $r \to \infty$. 39 At closer distances the potential is always repulsive ($U > 0$) and growing very fast at $r \to 0$, but its quantitative form is specific for each particular molecule. 40 The crudest description of such repulsion is given by the so-called hardball model:

$$U(r) = \begin{cases} +\infty, & \text{for } 0 < r < 2r_0, \\ 0, & \text{for } 2r_0 < r < \infty, \end{cases} \tag{3.95}$$

- see the dashed line and inset in Fig. 7. According to Eq. (93), in this model the second virial coefficient is temperature-independent:

$$B(T) = b \equiv \frac{1}{2}\int_0^{2r_0}4\pi r^2\,dr = \frac{2\pi}{3}(2r_0)^3, \tag{3.96}$$

(and is 4 times larger than the hardball volume $V_0 = (4\pi/3)r_0^3$), so that the equation of state (94) still gives a linear dependence of pressure on temperature.

39 Indeed, the independent fluctuation-induced components $\mathbf{p}(t)$ and $\mathbf{p}'(t)$ of dipole moments of two particles have random mutual orientation, so that the time average of their interaction energy, proportional to $r^{-3}$, vanishes. However, the electric field $\boldsymbol{\mathscr{E}}$ of each dipole $\mathbf{p}$, proportional to $r^{-3}$, induces a correlated component of $\mathbf{p}'$, also proportional to $r^{-3}$, giving a potential energy of their interaction, proportional to $\mathbf{p}'\cdot\boldsymbol{\mathscr{E}} \propto r^{-6}$, with a non-vanishing time average. A detailed theory of this effect, closely related to the Casimir effect in quantum mechanics (see, e.g., QM Sec. 9.1) may be found, e.g., in Secs. 80-82 of E. Lifshitz and L. Pitaevskii, Statistical Mechanics, pt. 2, Pergamon, 1980.

40 Note that the particular form of the first term in the approximation $U(r) = a/r^{12} - b/r^6$ (called the Lennard-Jones potential or the “12-6 potential”), suggested in 1924, lacks physical justification and was soon replaced, in professional physics, by better approximations, including the so-called exp-6 model (better fitting most experimental data) and the Morse potential (more convenient for quantum-mechanical calculations - see QM Chapter 2). However, the Lennard-Jones potential still travels from one undergraduate textbook to another, as a trick for simpler calculation of the equilibrium distance between the particles by differentiation.

A correction to this result may be obtained by the following approximate account of the long-range attraction (see the dash-dotted line in Fig. 7): 41

$$U(r) = \begin{cases} +\infty, & \text{for } 0 < r < 2r_0, \\ U(r), \text{ with } |U| \ll T, & \text{for } 2r_0 < r < \infty, \end{cases} \tag{3.97}$$

which is sometimes called the hard core model. Then Eq. (93) yields:

$$B(T) = b + \frac{1}{2}\int_{2r_0}^{\infty}\left(1 - e^{-U(r)/T}\right)4\pi r^2\,dr \approx b + \frac{1}{T}\int_{2r_0}^{\infty}U(r)\,2\pi r^2\,dr \equiv b - \frac{a}{T}, \quad \text{with } a \equiv 2\pi\int_{2r_0}^{\infty}|U(r)|\,r^2\,dr. \tag{3.98}$$

In this model, the equation of state (94) acquires a temperature-independent term:

$$P = T\left[\frac{N}{V} + \left(b - \frac{a}{T}\right)\left(\frac{N}{V}\right)^2\right] = T\left[\frac{N}{V} + b\left(\frac{N}{V}\right)^2\right] - a\left(\frac{N}{V}\right)^2. \tag{3.99}$$

Still, the correction to the ideal-gas pressure is proportional to $(N/V)^2$, and has to be relatively small for Eq. (99) to be valid, so that the right-hand part of Eq. (99) may be considered as the sum of two leading terms in the general expansion of $P$ into the Taylor series in the low density $n = N/V$ of the gas:

$$P = T\left[\frac{N}{V} + B(T)\left(\frac{N}{V}\right)^2 + C(T)\left(\frac{N}{V}\right)^3 + \ldots\right], \tag{3.100}$$

where $C(T)$ is called the third virial coefficient. It is natural to ask how we can calculate $C(T)$ and the higher virial coefficients.
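The quality of the high-temperature approximation $B(T) \approx b - a/T$ can be checked numerically for a concrete attractive tail. The sketch below is an illustration under explicit assumptions that go beyond the text: it takes the hard core model (97) with the specific tail $U(r) = -U_0(2r_0/r)^6$ at $r > 2r_0$ (a hypothetical choice motivated by the $r^{-6}$ asymptotics), for which the definition of $a$ gives $a = bU_0$, and it evaluates the integral (93) by the midpoint rule after the substitution $u = 2r_0/r$.

```python
import math


def second_virial(x, d=1.0, n=2000):
    """B(T) from Eq. (93) for U = +inf at r < d and U = -U0*(d/r)**6 at r > d.

    x -- the ratio U0/T (assumed small in the hard core model)
    d -- the hard-core diameter 2*r0
    The outer integral (1/2) * int_d^inf (1 - exp(-U/T)) 4*pi*r^2 dr is mapped,
    with u = d/r, onto 2*pi*d^3 * int_0^1 (1 - exp(x*u^6)) / u^4 du.
    """
    b = 2.0 * math.pi * d ** 3 / 3.0   # hardball part, Eq. (96)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h              # midpoint rule avoids the point u = 0
        total += -math.expm1(x * u ** 6) / u ** 4
    return b + 2.0 * math.pi * d ** 3 * total * h


def second_virial_approx(x, d=1.0):
    """b - a/T with a = b*U0 for the assumed r^-6 tail, i.e. b*(1 - U0/T)."""
    b = 2.0 * math.pi * d ** 3 / 3.0
    return b * (1.0 - x)


if __name__ == "__main__":
    for x in (0.01, 0.05, 0.2, 1.0):
        print(x, second_virial(x), second_virial_approx(x))
```

For $U_0/T \ll 1$ the two values differ only in the second order in $U_0/T$; as $U_0/T$ approaches 1 the linearization underlying Eq. (98) visibly fails, in line with the strong inequality required in Eq. (97).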

Generally, this may be done just by a careful analysis of Eq. (90), 42 but I would like to use this occasion to demonstrate a different, very interesting approach, called the cluster expansion method, 43 which allows one to streamline such calculations. Let us apply to our system, with energy (87), the grand canonical distribution. (Just as in Sec. 2, we may argue that if the average number $\langle N\rangle$ of particles in a member of a grand canonical ensemble, with fixed $\mu$ and $T$, is much larger than 1, the relative fluctuations of that number are small, so that all its thermodynamic properties should be similar to those when $N$ is exactly fixed - as it is assumed when applying the Gibbs distribution valid for the canonical ensemble.) For our case, the grand canonical distribution, Eq. (2.109), may be recast as

$$\Omega = -T\ln\sum_{N=0}^{\infty}Z_N, \quad Z_N \equiv e^{\mu N/T}\sum_m e^{-E_{m,N}/T}, \quad E_{m,N} = \sum_{k=1}^{N}\frac{p_k^2}{2m} + U(\mathbf{r}_1,\ldots,\mathbf{r}_N). \tag{3.101}$$

41 The strong inequality between $U$ and $T$ in this model is necessary not only to make calculations simpler. A deeper reason is that if $(-U_{min})$ becomes comparable with, or larger than $T$, particles may become trapped in the potential well formed by this potential, forming a different phase - a liquid or a solid. In such phases, the probability to find more than two particles interacting simultaneously is high, so that approximation (92), on which all our further results are based, becomes invalid.

42 L. Boltzmann has used that way to calculate the 3rd and 4th virial coefficients for the hardball model - as much as can be done analytically.


(Notice that here, as always in the grand canonical distribution, $N$ means a particular rather than average number of particles.) Now, let us try to forget for a second that in real systems of interest the number of particles is extremely large, and start to calculate, one by one, the first terms $Z_N$.

In the term with $N = 0$, both contributions to $E_{m,N}$ vanish, and so does $\mu N/T$, so that $Z_0 = 1$. In the next term, with $N = 1$, the interaction term vanishes, so that $E_{m,1}$ is reduced to the kinetic energy of one particle, giving

$$Z_1 = e^{\mu/T}\sum_k \exp\left\{-\frac{p_k^2}{2mT}\right\}. \tag{3.102}$$


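Making the usual transition from summation to integration, we may write

$$Z_1 = ZI_1, \quad \text{with } Z \equiv e^{\mu/T}\frac{gV}{(2\pi\hbar)^3}\int\exp\left\{-\frac{p^2}{2mT}\right\}d^3p, \text{ and } I_1 \equiv 1. \tag{3.103}$$

This is the same simple (Gaussian) integral as in Eq. (6), giving

$$Z = e^{\mu/T}\frac{gV}{(2\pi\hbar)^3}\left(2\pi mT\right)^{3/2} = e^{\mu/T}gV\left(\frac{mT}{2\pi\hbar^2}\right)^{3/2}. \tag{3.104}$$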


Now let us explore the next term, with $N = 2$, which describes, in particular, pair interactions $U = U(\mathbf{r})$, with $\mathbf{r} \equiv \mathbf{r}_k - \mathbf{r}_{k'}$. Due to the particle indistinguishability, this term needs the “correct Boltzmann counting” factor 1/2! - cf. Eqs. (12) and (88):

$$Z_2 = e^{2\mu/T}\frac{1}{2!}\sum_{k,k'}\exp\left\{-\frac{p_k^2}{2mT} - \frac{p_{k'}^2}{2mT} - \frac{U(\mathbf{r})}{T}\right\}. \tag{3.105}$$


Since $U$ is coordinate-dependent, here the transfer from summation to integration should be done more carefully than in the first term - cf. Eqs. (24) and (88):

$$Z_2 = e^{2\mu/T}\frac{1}{2!}\frac{(gV)^2}{(2\pi\hbar)^6}\int\exp\left\{-\frac{p^2}{2mT}\right\}d^3p \times \int\exp\left\{-\frac{p'^2}{2mT}\right\}d^3p' \times \frac{1}{V}\int e^{-U(\mathbf{r})/T}d^3r. \tag{3.106}$$

Comparing this expression with the definition of parameter $Z$ in Eq. (103), we get

$$Z_2 = \frac{Z^2}{2!}I_2, \quad I_2 \equiv \frac{1}{V}\int e^{-U(\mathbf{r})/T}d^3r. \tag{3.107}$$

Acting absolutely similarly, for the third term of the grand canonical sum we may get

$$Z_3 = \frac{Z^3}{3!}I_3, \quad I_3 \equiv \frac{1}{V^2}\int e^{-U(\mathbf{r}',\mathbf{r}'')/T}d^3r'\,d^3r'', \tag{3.108}$$

where $\mathbf{r}'$ and $\mathbf{r}''$ are the vectors characterizing the mutual positions of 3 particles - see Fig. 6b.

43 This method was developed in 1937-38 by J. Mayer and collaborators for a classical gas, and generalized to quantum systems in 1938 by B. Kahn and G. Uhlenbeck.

These results may be extended by induction to an arbitrary $N$. Plugging the expression for $Z_N$ into Eq. (101) and recalling that $\Omega = -PV$, we get the equation of state of the gas in the form

$$P = \frac{T}{V}\ln\left(1 + ZI_1 + \frac{Z^2}{2!}I_2 + \frac{Z^3}{3!}I_3 + \ldots\right). \tag{3.109}$$


As a sanity check, at $U = 0$, all integrals $I_N$ are obviously equal to 1, the expression under the logarithm is just the Taylor expansion of $e^Z$, giving $P = TZ/V$, and $\Omega = -PV = -TZ$. In this case, according to the last of Eqs. (1.62), the average number of particles in the system is $\langle N\rangle = -(\partial\Omega/\partial\mu)_{T,V} = Z$, because since $Z \propto \exp\{\mu/T\}$, $\partial Z/\partial\mu = Z/T$. Thus, we have happily recovered the equation of state of the ideal gas, $PV = \langle N\rangle T$. 44


Returning to the general case of nonvanishing interactions, let us assume that the logarithm in Eq. (109) may be also presented as a Taylor series in $Z$:

$$P = \frac{T}{V}\sum_{l=1}^{\infty}\frac{J_l}{l!}Z^l. \tag{3.110}$$

(The lower limit of the sum reflects the fact that according to Eq. (109), at $Z = 0$, $P = (T/V)\ln 1 = 0$.) According to Eq. (1.60), this expansion corresponds to the grand potential

$$\Omega = -PV = -T\sum_{l=1}^{\infty}\frac{J_l}{l!}Z^l. \tag{3.111}$$

Again using the last of Eqs. (1.62), we get

$$\langle N\rangle = \sum_{l=1}^{\infty}\frac{J_l}{(l-1)!}Z^l. \tag{3.112}$$


This equation, for given $\langle N\rangle$, may be used to find $Z$ and hence for the calculation of the equation of state from Eq. (110). The only remaining conceptual action item is to express coefficients $J_l$ via the integrals $I_l$ participating in expansion (109). This may be done using the well-known Taylor expansion of the logarithm function, 45

$$\ln(1 + \xi) = \sum_{l=1}^{\infty}(-1)^{l+1}\frac{\xi^l}{l}. \tag{3.113}$$

Using it together with Eq. (109), we get a Taylor series in $Z$, starting as

$$P = \frac{T}{V}\left[Z + \frac{Z^2}{2!}(I_2 - 1) + \frac{Z^3}{3!}\Big((I_3 - 1) - 3(I_2 - 1)\Big) + \ldots\right]. \tag{3.114}$$

44 Actually, the fact that in that case $Z = \langle N\rangle$, could have been noted earlier by comparing Eq. (104) with Eq. (39).

45 Looking at Eq. (109), one may think that since $\xi = Z + Z^2I_2/2 + \ldots$ is of the order of at least $Z \sim \langle N\rangle \gg 1$, the expansion (113), which converges only if $|\xi| < 1$, is illegitimate. However, the expansion is justified by its result (114), in which the $n$-th term is of the order of $\langle N\rangle^n(V_0/V)^{n-1}/n!$, so that the series does converge if the gas density is sufficiently low: $\langle N\rangle/V \ll 1/V_0$, i.e. $r_{ave} \gg r_0$. This is the very beauty of the cluster expansion, whose few first terms present a good approximation even for a gas with $\langle N\rangle \gg 1$ particles.


Comparing this expression with Eq. (110), we see that 


/, = 1 , 




J 2 =I 2 -l = - 
J 3 =(I 3 -\)-3(I 2 -l) 

= ^l( e " C/(r ’ r " )/r -e“ C/(r ' )/r -e-^ r " )/r -e-^ /r + 2UVJV 


(3.115) 


where $\mathbf{r}''' \equiv \mathbf{r}' - \mathbf{r}''$ - see Fig. 6b. The expression for $J_2$, describing the pair interactions of particles, is
(within a numerical factor) equal to the second virial coefficient $B(T)$ - see Eq. (93). As a reminder, the
subtraction of 1 from integral $I_2$ in the second of Eqs. (115) makes the contribution of each elementary
3D volume $d^3r$ into integral $J_2$ nonvanishing only if at this $\mathbf{r}$ two particles interact ($U \neq 0$). Very
similarly, in the last of Eqs. (115), the subtraction of three pair-interaction terms from $(I_3 - 1)$ makes the
contribution from elementary 6D volume $d^3r'\, d^3r''$ into integral $J_3$ nonvanishing only if at that mutual location of
particles all three of them interact simultaneously.
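The comparison of the two series can also be checked mechanically. The following minimal Python sketch (an illustration, not part of the original derivation) expands $\ln(1 + \xi)$ to order $Z^3$ in exact rational arithmetic, assuming $\xi = Z + I_2 Z^2/2! + I_3 Z^3/3!$ with $I_1 = 1$, and reads off $J_l$ as the coefficients of $Z^l/l!$. The function names and the numerical values of $I_2$ and $I_3$ are arbitrary illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

ORDER = 3  # keep terms up to Z^3


def poly_mul(a, b):
    """Product of two polynomials in Z (coefficient lists), truncated at Z^ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out


def log1p_series(xi):
    """ln(1 + xi) = sum_{l>=1} (-1)^(l+1) xi^l / l, truncated at Z^ORDER.

    Since xi has no Z^0 term, xi^l starts at Z^l, so l <= ORDER is sufficient.
    """
    total = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # xi^0
    for l in range(1, ORDER + 1):
        power = poly_mul(power, xi)
        sign = 1 if l % 2 == 1 else -1
        total = [t + sign * p / l for t, p in zip(total, power)]
    return total


def cluster_coefficients(I2, I3):
    """Return (J1, J2, J3), defined by ln(1 + xi) = sum_l J_l Z^l / l!."""
    xi = [Fraction(0), Fraction(1), Fraction(I2) / 2, Fraction(I3) / 6]
    c = log1p_series(xi)
    return tuple(c[l] * factorial(l) for l in (1, 2, 3))


if __name__ == "__main__":
    I2, I3 = Fraction(7, 10), Fraction(2, 5)  # arbitrary illustrative values
    J1, J2, J3 = cluster_coefficients(I2, I3)
    print(J1, J2, J3)
    print(J2 == I2 - 1, J3 == (I3 - 1) - 3 * (I2 - 1))
```

Exact fractions are used instead of floats so that the equalities $J_2 = I_2 - 1$ and $J_3 = (I_3 - 1) - 3(I_2 - 1)$ can be tested without any rounding tolerance.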

In order to illustrate the cluster expansion method at work, let us eliminate factor $Z$ from the
system of equations (110) and (112), keeping (for the sake of simplicity) only the terms up to $O(Z^3)$, as
has been done in Eq. (114):


$$\frac{PV}{T} = J_1 Z + \frac{J_2}{2} Z^2 + \frac{J_3}{6} Z^3 + \ldots, \tag{3.116}$$

$$\langle N \rangle = J_1 Z + J_2 Z^2 + \frac{J_3}{2} Z^3 + \ldots \tag{3.117}$$

Dividing these two expressions, we get the result

$$\frac{PV}{\langle N \rangle T} \approx \frac{1 + (J_2/2J_1) Z + (J_3/6J_1) Z^2}{1 + (J_2/J_1) Z + (J_3/2J_1) Z^2} \approx 1 - \frac{J_2}{2J_1} Z + \left(\frac{J_2^2}{2J_1^2} - \frac{J_3}{3J_1}\right) Z^2, \tag{3.118}$$


which is accurate up to terms $O(Z^2)$. In this approximation, we may use Eq. (117), solved for $Z$ with
the same accuracy:


$$Z \approx \langle N \rangle - \frac{J_2}{J_1} \langle N \rangle^2. \tag{3.119}$$

Plugging this expression into Eq. (118), we get expansion (100) with

$$B(T) = -\frac{J_2}{2J_1} V, \qquad C(T) = \left(\frac{J_2^2}{J_1^2} - \frac{J_3}{3J_1}\right) V^2. \tag{3.120}$$
The first of these relations, combined with the first two of Eqs. (115), yields, for the 2nd virial
coefficient, the same Eq. (93) that was obtained from the Gibbs distribution, while the second one
allows us to calculate the 3rd virial coefficient $C(T)$. (Let me leave the calculation of $J_3$ and $C(T)$, for the
hardball model, for the reader’s exercise.) Evidently, a more accurate expansion of Eqs. (110), (112),
and (114) may be used to calculate an arbitrary virial coefficient, though starting from the 5th
coefficient, such calculations may be completed only numerically even in the simplest hardball model.
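The elimination of $Z$ described above can be cross-checked numerically. The sketch below (an illustration under stated assumptions, not the author's calculation) takes $J_1 = 1$, truncates both cluster series at $Z^3$, solves the series for $\langle N \rangle$ for $Z$ by Newton's method, and compares the resulting $PV/\langle N \rangle T$ with the second-order virial form $1 - (J_2/2)\langle N \rangle + (J_2^2 - J_3/3)\langle N \rangle^2$. The values of $J_2$, $J_3$ and $\langle N \rangle$ are arbitrary small-density numbers, not those of any particular interaction model.

```python
def compressibility_factor(J2, J3, N):
    """Return (exact, series) values of PV/(<N> T), assuming J1 = 1.

    exact  - Z eliminated numerically from the truncated series
             <N> = Z + J2 Z^2 + J3 Z^3 / 2  and  PV/T = Z + J2 Z^2/2 + J3 Z^3/6
    series - 1 - (J2/2) <N> + (J2^2 - J3/3) <N>^2
    """
    Z = float(N)  # ideal-gas starting guess, Z = <N>
    for _ in range(50):  # Newton iterations for Z
        f = Z + J2 * Z**2 + 0.5 * J3 * Z**3 - N
        df = 1.0 + 2.0 * J2 * Z + 1.5 * J3 * Z**2
        Z -= f / df
    pv_over_t = Z + 0.5 * J2 * Z**2 + J3 * Z**3 / 6.0
    exact = pv_over_t / N
    series = 1.0 - 0.5 * J2 * N + (J2**2 - J3 / 3.0) * N**2
    return exact, series


if __name__ == "__main__":
    # Illustrative (hypothetical) values with |J2| <N> = 0.02 << 1
    exact, series = compressibility_factor(J2=-2e-5, J3=3e-10, N=1000.0)
    print(exact, series, exact - series)
```

The difference between the two numbers is of the third order in the small parameters, which is the accuracy expected from a series truncated after the third virial coefficient.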


3.6. Exercise problems 

3.1. Use the Maxwell distribution for an alternative (statistical) calculation of the mechanical
work performed (per cycle) by the Maxwell-Demon heat engine discussed in Sec. 2.3.

Hint : You may assume the simplest geometry of the engine - see Fig. 2.4. 

3.2.* Use the Maxwell distribution to find the damping
coefficient $\eta_P = -\partial P/\partial u$, where $P$ is the pressure exerted by an ideal
classical gas on a piston moving with very low velocity $u$, in the simplest
geometry shown in the figure on the right, assuming that collisions of gas
particles with the piston are elastic.

3.3.* An ideal gas of $N \gg 1$ classical particles of mass $m$ is confined in a container of volume $V$.
At some moment, a very small hole, of area $A \ll V^{2/3}, l^2$ (where $l$ is the mean free path of the particles),
is opened in its wall, allowing the particles to escape into the surrounding vacuum. Find the r.m.s. velocity
of the escaped particles, assuming that the gas stays in a virtual thermal equilibrium at temperature $T$.

3.4.* For the system analyzed in the previous problem, calculate the law of reduction of the
number of particles in time after opening the hole.

3.5. Derive the equation of state of the ideal classical gas from the grand canonical distribution.

3.6.* A round cylinder of radius $R$ and length $L$, containing an ideal classical gas of $N \gg 1$
particles of mass $m$, is rotated about its symmetry axis with angular velocity $\omega$. Assuming that the gas
rotates with the cylinder, and is in thermodynamic equilibrium at temperature $T$,

(i) calculate the gas pressure distribution along its radius, and analyze its temperature dependence,

(ii) neglecting the internal degrees of freedom of the particles, calculate the total energy of the
gas and its heat capacity in the high- and low-temperature limits, and

(iii) formulate the conditions of validity of your result in terms of strong inequalities between the
following length scales: the quantum correlation length $r_c = \hbar/(mT)^{1/2}$, the effective particle size $r_0$, the
average distance $r_{ave} = (\pi R^2 L/N)^{1/3}$ between the particles, and the cylinder’s radius $R$.

Hint: One of the considerations in (iii) should be the role of the particle’s mean free path.

3.7. Prove that Eq. (22), derived for the change of entropy at mixing of two ideal classical gases
of completely distinguishable particles (that had equal densities N/V and temperatures T before mixing) 
is also valid if particles in each of the initial volumes are identical to each other, but different from those 
in the counterpart sub-volume. Assume that masses of all the particles are equal. 
3.8. $N$ classical, non-interacting, indistinguishable particles of mass $m$ are confined in a
parabolic, spherically-symmetric 3D potential well $U(r) = \kappa r^2/2$. Use two different approaches to
calculate all major thermodynamic characteristics of the system, in thermal equilibrium at temperature
$T$, including its heat capacity. Which of the results should be changed if the particles are distinguishable,
and how?

Hint: Suggest a reasonable replacement of the notions of volume and pressure for this system.

3.9.* Calculate the basic thermodynamic characteristics, including all relevant thermodynamic
potentials, specific heat, and the surface tension $\sigma = (\partial F/\partial A)_{T,N}$ (where $A$ is the system area), for an
ideal 2D electron gas with given areal density $n = N/A$:

(i) at $T = 0$, and

(ii) at low but nonvanishing temperatures (to the first substantial order in $T/\varepsilon_F \ll 1$),
neglecting the Coulomb interaction effects.


3.10. How does the Fermi statistics of an ideal gas affect the barometric formula (28)?


3.11. Calculate the free carrier density in a semiconductor with bandgap $\Delta \gg T$, assuming
isotropic, parabolic dispersion laws of excitations in its conduction and valence bands.


Hint: In semiconductor physics, the names of conduction and valence
bands are used for two adjacent allowed energy bands 46 such that at $T = 0$, all states of
the valence band are fully occupied by electrons, while the conduction band is
completely empty - see the figure on the right. Within the simple model mentioned in
the assignment (which gives a good approximation for semiconductors of the A3B5
group, e.g., GaAs), the energy of an electron-like excitation in the
conduction band follows the isotropic, parabolic law (3.3), but with the origin at
the band edge $\varepsilon_C$, and an effective mass $m_C$ usually smaller than the free electron
mass $m_e$. Similarly, the parabolic dispersion law of a single “no-electron”
excitation (called the hole) in the valence band is offset to the edge of that band,
$\varepsilon_V$, and corresponds to a negative effective mass $(-m_V)$:

$$\varepsilon = \begin{cases} \varepsilon_C + p^2/2m_C, & \text{for } \varepsilon \geq \varepsilon_C, \\ \varepsilon_V - p^2/2m_V, & \text{for } \varepsilon \leq \varepsilon_V, \end{cases} \qquad \text{where } \varepsilon_C - \varepsilon_V \equiv \Delta.$$

The excitations of both types follow the Fermi-Dirac statistics, and (within this simple model) do not
interact directly.


3.12. Using the same energy band model as in the previous problem, calculate the chemical
potential and the equilibrium density of electrons and holes in an n-doped semiconductor, with $n_D$
dopants per unit volume.


46 A discussion of the energy band theory may be found, e.g., in QM Sec. 2.7 and 3.4. Though the reader is highly 
encouraged to review the discussions of this (very important) topic, such a review is not required for solving this 
particular problem. 
Hint: n-doping means placing, into a semiconductor crystal, a relatively
small density $n_D$ of different atoms, called dopants or donors, which may be
easily single-ionized - in the case of donors, giving an additional electron to
the semiconductor crystal, with the donor atom becoming a positive ion. As a
result, the n-doping may be represented as a set of additional electrons with the
ground state energy on an additional discrete level $\varepsilon_D$ slightly below the
conduction band edge $\varepsilon_C$ - see the figure on the right.

3.13.* Generalize the solution of the previous problem to the case when
the n-doping with dopant density $n_D$ is complemented with the simultaneous p-doping
by acceptor atoms, whose energy of accepting an additional electron
(and hence the negative ionization) is much less than the bandgap.

Hint: Similarly to the n-doping, the effect of p-doping may be
described as an addition of a discrete electron energy level $\varepsilon_A$, slightly above
the valence band edge $\varepsilon_V$ - see the figure on the right.

3.14. Calculate the paramagnetic response (the Pauli paramagnetism) of a degenerate ideal gas
of spin-1/2 particles to a weak external magnetic field, due to partial spin alignment with the field.

3.15.* Explore the Thomas-Fermi model 47 of a heavy atom, with nuclear charge $Q = Ze \gg e$, in
which the electrons are treated as a degenerate Fermi gas, interacting with each other only via their
contribution to the common electrostatic potential $\phi(\mathbf{r})$. In particular, derive the ordinary differential
equation obeyed by the radial distribution of the potential, and use it to estimate the effective radius of
the atom.

3.16.* Use the Thomas-Fermi model, explored in the previous problem, to calculate the total
binding energy of a heavy atom. Compare the result with that for the simpler model, in which the
electron-electron interaction is completely ignored.

3.17. Derive the general expressions for the calculation of energy $E$ and chemical potential $\mu$ of a
Fermi gas of $N$ non-interacting, indistinguishable, ultrarelativistic particles confined in volume $V$. 48
Calculate $E$, and also gas pressure $P$, explicitly in the degenerate gas limit $T \to 0$. In particular, is Eq.
(3.48) of the lecture notes, $PV = (2/3)E$, valid in this case?

3.18. Calculate the pressure of an ideal gas of ultra-relativistic, indistinguishable quantum
particles, for an arbitrary temperature, as a function of the total energy $E$ of the gas, and its volume $V$.
Compare the result with the corresponding relations for the electromagnetic blackbody radiation and an
ideal gas of nonrelativistic particles.

3.19.* Calculate the speed of sound in an ideal gas of ultra-relativistic fermions of density $n$ at
negligible temperature.


47 It was suggested in 1927, independently, by L. Thomas and E. Fermi.

48 This is, for example, an approximate model for electrons in white dwarf stars, whose Coulomb interaction is 
mostly compensated by the charge of nuclei of fully ionized helium atoms. 
3.20. Calculate the effective latent heat $\Lambda = -N(\partial Q/\partial N_0)_{N,V}$ of the Bose-Einstein condensate
evaporation, as a function of temperature $T$. Here $Q$ is the heat absorbed by the (condensate + gas)
system as a whole, while $N_0$ is the number of particles in the condensate alone.

3.21.* For an ideal Bose gas, calculate the law of the chemical potential disappearance at $T \to T_c$,
and use the result to prove that the gas’ specific heat $C_V$ is a continuous function of temperature at the
critical point $T = T_c$.

3.22. In Chapter 1 of the lecture notes, several thermodynamic equations involving entropy have
been discussed, including the first of Eqs. (1.39):

$$S = -(\partial G/\partial T)_P.$$

If we combine this expression with the fundamental relation (1.56), $G = \mu N$, it looks like for the
Bose-Einstein condensate, whose chemical potential $\mu$ vanishes at temperatures below the critical value
$T_c$, the entropy should vanish as well. On the other hand, dividing both parts of Eq. (1.19) by $dT$, and
assuming that at this temperature change the volume is kept constant, we get

$$C_V = T(\partial S/\partial T)_V.$$

If $C_V$ is known as a function of temperature, the last equation may be integrated to calculate $S$:

$$S = \int_{V = \text{const}} \frac{C_V(T)}{T}\, dT + \text{const}.$$

For the Bose-Einstein condensate, we have calculated the specific heat to be proportional to $T^{3/2}$, so that
the integration gives nonvanishing entropy $S \propto T^{3/2}$. Explain this paradox.

3.23. The standard approach to the Bose-Einstein condensation, outlined in Sec. 4, may seem to
ignore the energy quantization of particles confined in volume $V$. Use the particular case of a cubic
confining volume $V = a \times a \times a$ with rigid walls to analyze whether the main conclusions of the standard
theory, in particular Eq. (71) for the critical temperature, are affected by such quantization.

3.24.* An ideal 3D Bose gas of $N \gg 1$ non-interacting particles is confined at the bottom of a
soft, spherically-symmetric potential well, whose potential may be approximated as $U(r) = m\omega^2 r^2/2$.
Develop the theory of the Bose-Einstein condensation in this system; in particular, prove Eq. (3.74b) of
the lecture notes, and calculate the critical temperature $T_c$. Looking at the solution, what is the most
straightforward way to detect the condensation?

3.25. Calculate the chemical potential of an ideal 2D gas of spin-0 Bose particles as a function of
its areal density $n$ (the number of particles per unit area), and find out whether such a gas can condense
at low temperatures.

3.26.* Use Eqs. (115) and (120) to calculate the third virial coefficient $C(T)$ for the hardball
model of particle interactions.
Chapter 4. Phase Transitions

This chapter is a brief discussion of coexistence between different states (“phases”) of collections of the 
same particles, and the laws of transitions between these phases. Due to the complexity of these 
phenomena, which involve interaction of the particles, quantitative results have been obtained only for a 
few very simple models, typically giving only a very approximate description of real systems. 


4.1. First-order phase transitions 

From everyday experience, say with ice, water, and water vapor, we know that one chemical 
substance (i.e. a set of many similar particles) may exist in several stable states - phases. A typical 
substance may have: 

(i) a dense solid phase, in which interatomic forces keep all atoms in fixed relative positions, 
with just small thermal fluctuations about them; 

(ii) a liquid phase, of comparable density, in which the relative distances between atoms or 
molecules are almost constant, but the particles are free to move about each other, and 

(iii) the gas phase, typically of a much lower density, in which the molecules are virtually free to 
move all around the containing volume. 1 

Experience also tells us that at certain conditions, two phases may be in thermal and chemical 
equilibrium - say, ice floating on water, with temperature at the freezing point. Actually, in Sec. 3.4 we 
already discussed a qualitative theory of one such equilibrium, the Bose-Einstein condensate 
coexistence with the uncondensed “vapor” of similar particles. However, this is a rather rare case when 
the phase coexistence is due to the quantum nature of particles (bosons) that may not interact directly. 
Much more frequently, the formation of different phases, and transitions between them, is an essentially
classical effect due to particle interactions. 

Phase transitions are sometimes classified by their order. 2 I will start my discussion with the
first-order phase transitions that feature non-vanishing latent heat $\Lambda$ - the amount of heat that is
necessary to give to one phase in order to turn it into another phase, even if temperature and pressure are
kept constant. 3 Let us discuss the simplest and most popular phenomenological model of the first-order
phase transition, suggested in 1873 by J. van der Waals.

In the last chapter, we have derived Eq. (3.99) for the classical gas of weakly interacting 
particles, which takes into account (albeit approximately) both interaction components necessary for a 


1 The plasma phase, in which atoms are partly or completely ionized, is frequently mentioned in the same breath
as the three phases listed above, but one has to remember that in contrast to them, a typical electroneutral plasma
consists of particles of two different sorts - ions and electrons.

2 Such classification schemes, started by P. Ehrenfest, have been repeatedly modified, and only the “first-order 
phase transition” is still a generally accepted term. 

3 For example, for water the latent heat of vaporization at ambient pressure is as high as $\sim 2.2 \times 10^6$ J/kg, i.e. ~0.4
eV per molecule, making this liquid indispensable for many practical purposes - including fire fighting. (The
latent heat of water’s ice melting is an order of magnitude lower.)
realistic discussion of gas condensation - the long-range attraction of the particles and their short-range
repulsion. Let us rewrite that result as follows:

$$P + a\frac{N^2}{V^2} = \frac{NT}{V}\left(1 + \frac{Nb}{V}\right). \tag{4.1}$$


As we saw in Sec. 3.5, the physical meaning of constant $b$ is the effective volume of space taken
by a particle pair collision. Equation (1) is quantitatively valid only if the second term in the parentheses
is small, $Nb \ll V$, i.e. if the total volume excluded from particles’ free motion because of their collisions
is much smaller than the whole volume $V$ of the system. In order to describe the condensed phase (which
I will call “liquid”), 4 we need to generalize this relation to the case $Nb \sim V$. Since the effective volume
left for particles’ motion is $V - Nb$, it is very natural to make the following replacement: $V \to V - Nb$, in
the ideal gas’ equation of state. If we still keep the term $aN^2/V^2$, which describes the long-range
attraction of particles, we get the van der Waals equation

$$P + a\frac{N^2}{V^2} = \frac{NT}{V - Nb}. \tag{4.2}$$


The advantage of this simple model is that in the rare gas limit, Nb « V, it reduces back to Eq. (1). (To 
check this, it is sufficient to Taylor-expand the right-hand part of Eq. (2) in small parameter Nb/V « 1, 
and retain only two leading terms corresponding to two first virial coefficients.) Let us explore 
properties of this model. 
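Spelled out, the check suggested in the parentheses above takes one line. It uses nothing beyond the geometric series in the small parameter $Nb/V$:

$$\frac{NT}{V - Nb} = \frac{NT}{V}\left(1 - \frac{Nb}{V}\right)^{-1} = \frac{NT}{V}\left[1 + \frac{Nb}{V} + \left(\frac{Nb}{V}\right)^2 + \ldots\right] \approx \frac{NT}{V}\left(1 + \frac{Nb}{V}\right), \quad \text{for } Nb \ll V,$$

so that the right-hand side of the van der Waals equation indeed reduces to that of Eq. (1) once the terms of the second and higher orders in $Nb/V$ are dropped.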


It is frequently convenient to discuss any equation of state in terms of its isotherms, i.e. $P(V)$
curves plotted at constant $T$. As Eq. (2) shows, in the van der Waals model such a plot depends on four
parameters ($a$, $b$, $N$, and $T$). However, for its analysis it is convenient to introduce dimensionless
variables: pressure $p \equiv P/P_c$, volume $v \equiv V/V_c$, and temperature $t \equiv T/T_c$, normalized to their so-called
critical values,

$$P_c = \frac{1}{27}\frac{a}{b^2}, \qquad V_c = 3Nb, \qquad T_c = \frac{8}{27}\frac{a}{b}. \tag{4.3}$$

In these notations, Eq. (2) acquires the following form,

$$p + \frac{3}{v^2} = \frac{8t}{3v - 1}, \tag{4.4}$$


so that the normalized isotherms $p(v)$ depend on only one parameter, the normalized temperature $t$ - see
Fig. 1. The most important property of these plots is that the isotherms have qualitatively different
shapes in two temperature regions. 5 At $t > 1$, i.e. $T > T_c$, pressure increases monotonically at gas
compression (just like in an ideal gas, to which this system tends at $T \gg T_c$), i.e. $(\partial P/\partial V)_T < 0$ at all
points of the isotherm. However, below the critical temperature $T_c$, all isotherms feature segments with
$(\partial P/\partial V)_T > 0$. It is easy to understand that, at least in a constant pressure experiment (see, for example,


4 Due to the phenomenological character of the van der Waals model, one cannot say whether the condensed 
phase it predicts corresponds to a liquid or a solid. However, in most real substances at ambient conditions, gas 
coexists with liquid, hence the name I will use. 

5 The special choice of numerical coefficients in Eq. (3) is motivated by making the border between two regions
take place exactly at $t = 1$, i.e. at temperature $T_c$, with the critical point coordinates equal to $P_c$ and $V_c$.


Fig. 1.5), 6 these segments describe a mechanically unstable equilibrium. Indeed, if due to a random
fluctuation, the volume deviated upward from the equilibrium value, the pressure would also increase,
forcing the environment (say, the heavy piston in Fig. 1.5) to allow a further expansion of the system,
leading to even higher pressure, etc. A similar deviation of volume downward would lead to a similar
avalanche-like decrease of the volume. Such avalanche instability would develop further and further
until the system has reached one of the stable branches with a negative slope $(\partial P/\partial V)_T$. In the range
where the single-phase equilibrium state is unstable, the system as a whole may be stable only if it
consists of the two phases (one with a smaller, and another with a higher density $n = N/V$) that are
described by the two stable branches.
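The qualitative change of the isotherm shape at the critical temperature is easy to verify numerically. The sketch below (an illustration only) uses the reduced van der Waals equation in the form $p = 8t/(3v - 1) - 3/v^2$ and its first two derivatives over $v$; the function names, the sampling range and the step count are arbitrary assumptions. It checks that the point $v = t = 1$ gives $p = 1$ with both derivatives vanishing, and that a segment with positive slope exists at $t < 1$ but not at $t > 1$.

```python
def p_reduced(v, t):
    """Reduced van der Waals pressure p(v, t), valid for v > 1/3."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def dp_dv(v, t):
    """First derivative of p over v at constant t."""
    return -24.0 * t / (3.0 * v - 1.0)**2 + 6.0 / v**3


def d2p_dv2(v, t):
    """Second derivative of p over v at constant t."""
    return 144.0 * t / (3.0 * v - 1.0)**3 - 18.0 / v**4


def has_unstable_segment(t, v_min=0.4, v_max=10.0, steps=5000):
    """True if dp/dv > 0 at some sampled point of the isotherm."""
    for k in range(steps + 1):
        v = v_min + (v_max - v_min) * k / steps
        if dp_dv(v, t) > 0.0:
            return True
    return False


if __name__ == "__main__":
    print(p_reduced(1.0, 1.0), dp_dv(1.0, 1.0), d2p_dv2(1.0, 1.0))
    for t in (0.9, 1.1):
        print(t, has_unstable_segment(t))
```

The simultaneous vanishing of the first and second derivatives at $v = t = 1$ is what makes this point the border between the two families of isotherms.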



Fig. 4.1. The van der Waals equation
plotted on the [$p$, $v$] plane for several values
of reduced temperature $t = T/T_c$. Shading
shows the single-phase instability range in
which $(\partial P/\partial V)_T > 0$. (The reader is invited
to contemplate the physical sense and
possibility of experimental observation of
the negative values of pressure, predicted
by the model.)


In order to understand the basic properties of this two-phase system, let us recall the general
conditions of equilibrium of two thermodynamic systems, which have been discussed in Chapter 1:

$$T_1 = T_2 \quad \text{(thermal equilibrium)}, \tag{4.5}$$

$$\mu_1 = \mu_2 \quad \text{("chemical" equilibrium)}, \tag{4.6}$$

the latter condition meaning that the average energy of a single (“probe”) particle in both systems is the
same. To those, we should add the evident condition of mechanical equilibrium,

$$P_1 = P_2 \quad \text{(mechanical equilibrium)}, \tag{4.7}$$

that immediately follows from the balance of normal forces exerted on an inter-phase boundary.


If we discuss isotherms, Eq. (5) is fulfilled automatically, while Eq. (7) means that the effective
isotherm $P(V)$ describing a two-phase system should be a horizontal line (Fig. 2): 7


6 Actually, this assumption is not crucial for our analysis of mechanical stability, because if a fluctuation takes 
place in a small part of the total volume V, its other parts play the role of pressure-fixing environment. 

7 Frequently, especially for water gas diluted in air (vapor), $P_0(T)$ is called the saturated vapor pressure, while the
temperature at which $P_0(T)$ equals the ambient partial pressure of the vapor is called the dew point, and is frequently used for an
implicit characterization of the concentration $n = N/V$ of water vapor in air.




$$P = P_0(T). \tag{4.8}$$



Fig. 4.2. Phase equilibrium 
at T < T c (schematically). 


Along this line, internal properties of each phase do not change; only the particle distribution does:
it evolves gradually from all particles being in the liquid phase at point 1 to all particles being in the gas
phase at point 2. 8 In particular, according to Eq. (6), the chemical potentials $\mu$ of the phases should be
equal at each point of the horizontal line (8). This fact enables us to find the line’s position: it has to
connect points 1 and 2 in which the chemical potentials of the phases are equal to each other.

Let us recast this condition as 


$$\int_1^2 d\mu = 0, \quad \text{i.e.} \quad \int_1^2 dG = 0, \tag{4.9}$$


where the integral may be taken along the single-phase isotherm. (For this mathematical calculation, the
mechanical instability of some states on this curve is not important.) Along that curve, $N$ = const and $T$
= const, so that according to Eq. (1.53c), $dG = -SdT + VdP + \mu dN$, for a slow (reversible) change, $dG = VdP$.
Hence Eq. (9) yields

$$\int_1^2 V\, dP = 0. \tag{4.10}$$


From Fig. 2, it is easy to see that geometrically this equality means that the shaded areas $A_d$ and $A_u$
should be equal, and hence Eq. (10) may be rewritten in the form of the so-called Maxwell’s rule

$$\int_1^2 \left[P - P_0(T)\right] dV = 0. \tag{4.11}$$
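The step from Eq. (10) to the equal-area statement may be made explicit by integrating by parts along the single-phase isotherm, using only the fact that the pressure takes the same value $P_0(T)$ at both end points 1 and 2:

$$\int_1^2 V\, dP = \left[PV\right]_1^2 - \int_1^2 P\, dV = P_0(T)(V_2 - V_1) - \int_1^2 P\, dV = -\int_1^2 \left[P - P_0(T)\right] dV.$$

Hence Eq. (10) is equivalent to the requirement that the integral of $[P - P_0(T)]$ over $V$ between points 1 and 2 vanishes, i.e. that the areas enclosed between the isotherm and the horizontal line on its two sides compensate each other.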



8 An important question is: why does the phase-equilibrium line $P = P_0(T)$ stretch all the way from point 1 to
point 2 (Fig. 2)? Indeed, branches 1-1’ and 2-2’ of the single-phase isotherm have negative derivative $(\partial P/\partial V)_T$ and
hence are mechanically stable to small perturbations. The answer is that these branches are actually metastable,
i.e. have larger Gibbs energy per particle (i.e. $\mu$) than the counterpart phase and are hence unstable to larger
perturbations - such as foreign microparticles (say, dust), confining wall protrusions, etc. In very controlled
conditions, these single-phase “superheated” or “supercooled” states can survive virtually all the way to
zero-derivative points 1’ and 2’, leading to sudden jumps of the system into the counterpart phase. (For fixed-pressure
conditions, such jumps are shown by dashed lines in Fig. 2.) However, at more realistic conditions, perturbations
result in the two-phase coexistence extending all the way between (or very close to) points 1 and 2.
This relation is more convenient for calculations than Eq. (10) if the equation of state may be
explicitly solved for $P$ - as is the case for the van der Waals equation (2). Such a calculation (left for
the reader’s exercise) shows that for that model, the temperature dependence of the saturated gas pressure at
low $T$ is exponential,

$$P_0(T) = 27 P_c \exp\left\{-\frac{U_0}{T}\right\}, \quad \text{with } U_0 \equiv \frac{a}{b} = \frac{27}{8} T_c, \quad \text{for } T \ll T_c, \tag{4.12}$$

corresponding very well to the physical picture of the rate of particle activation from the potential well
of depth $U_0$. 9
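For intermediate temperatures, the equal-area construction has to be carried out numerically. The following minimal sketch (one possible implementation, not the author's) works in the reduced variables $p$, $v$, $t$, in which the van der Waals isotherm reads $p = 8t/(3v - 1) - 3/v^2$, and finds the reduced saturated pressure $p_0 = P_0/P_c$ as the level at which the integral of $[p(v) - p_0]$ between the liquid and gas end points vanishes. The helper names, the bisection solver and the bracketing choices are assumptions made for this illustration.

```python
import math


def p_vdw(v, t):
    """Reduced van der Waals pressure p(v, t), valid for v > 1/3."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    f_lo_positive = f(lo) > 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def spinodal_volumes(t):
    """Zero-derivative points of the isotherm: roots of 4 t v^3 = (3v - 1)^2."""
    g = lambda v: 4.0 * t * v**3 - (3.0 * v - 1.0)**2
    return bisect(g, 1.0 / 3.0, 1.0), bisect(g, 1.0, 9.0 / (4.0 * t))


def coexistence(t):
    """Return (p0, v_liquid, v_gas) from the equal-area rule, for 0 < t < 1."""
    v_a, v_b = spinodal_volumes(t)
    p_low = max(p_vdw(v_a, t), 1e-12)  # the coexistence pressure is positive
    p_high = p_vdw(v_b, t)

    def end_points(p0):
        f = lambda v: p_vdw(v, t) - p0
        v1 = bisect(f, 1.0 / 3.0 + 1e-12, v_a)           # liquid branch
        v2 = bisect(f, v_b, (8.0 * t / p0 + 1.0) / 3.0)  # gas branch
        return v1, v2

    def area(p0):
        # Integral of [p(v) - p0] dv from v1 to v2, taken analytically
        v1, v2 = end_points(p0)
        return ((8.0 * t / 3.0) * math.log((3.0 * v2 - 1.0) / (3.0 * v1 - 1.0))
                + 3.0 / v2 - 3.0 / v1 - p0 * (v2 - v1))

    p0 = bisect(area, p_low, p_high)
    v1, v2 = end_points(p0)
    return p0, v1, v2


if __name__ == "__main__":
    print(coexistence(0.9))  # p0 is close to 0.647 at t = 0.9
```

The area integral is evaluated in closed form, so the only numerical steps are the nested bisections; a library root finder could replace them without changing the logic.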


The signature parameter of the first-order phase transition, the latent heat of evaporation

$$\Lambda \equiv \int_1^2 dQ, \tag{4.13}$$

may be found by a similar integration along the single-phase isotherm. Indeed, using Eq. (1.19), $dQ = TdS$,
we get

$$\Lambda = \int_1^2 T\, dS = T(S_2 - S_1). \tag{4.14}$$


Instead of calculating entropy from the equation of state (as was done for the ideal gas in Sec. 1.4), it is
easier to express the right-hand side of Eq. (14) directly via that equation. For that, let us take the full
derivative of Eq. (6) over temperature, considering each value of $G = N\mu$ as a function of $P$ and $T$, and
taking into account that according to Eq. (7), $P_1 = P_2 = P_0(T)$:

$$\left(\frac{\partial G_1}{\partial T}\right)_P + \left(\frac{\partial G_1}{\partial P}\right)_T \frac{dP_0}{dT} = \left(\frac{\partial G_2}{\partial T}\right)_P + \left(\frac{\partial G_2}{\partial P}\right)_T \frac{dP_0}{dT}. \tag{4.15}$$

According to the first of Eqs. (1.39), the partial derivative $(\partial G/\partial T)_P$ is just minus entropy, while
according to the second of those equations, $(\partial G/\partial P)_T$ is the volume. Thus Eq. (15) becomes

$$-S_1 + V_1 \frac{dP_0}{dT} = -S_2 + V_2 \frac{dP_0}{dT}. \tag{4.16}$$


Solving this equation for $(S_2 - S_1)$, and plugging the result into Eq. (14), we get the Clapeyron-Clausius
formula

$$\Lambda = T(V_2 - V_1)\frac{dP_0}{dT}. \tag{4.17}$$

For the van der Waals model, this formula may be readily used for the analytical calculation of $\Lambda$ in
two limits: $T \ll T_c$ and $(T_c - T) \ll T_c$ - the exercise left for the reader. In the latter limit, $\Lambda \propto (T_c - T)^{1/2}$,
naturally vanishing at the critical temperature.

Finally, some important properties of the van der Waals’ model may be revealed more easily by
looking at the set of its isochores $P = P(T)$ for $V$ = const, rather than at the isotherms. Indeed, as Eq. (2)
shows, all single-phase isochores are straight lines. However, if we interrupt these lines at the points


9 It is fascinating how well this Arrhenius exponent is hidden in the polynomial van der Waals equation (2)!
where the single phase becomes metastable, and complement them with the (very nonlinear!)
dependence $P_0(T)$, we get the pattern (called the phase diagram) shown schematically in Fig. 3a.






Fig. 4.3. (a) The van der Waals model’s isochores, the saturated gas pressure diagram and 
the critical point, and (b) the phase diagram of a typical 3-phase system (schematically). 


At this plot, one more meaning of the critical point {$P_c$, $T_c$} becomes very clear. At fixed
pressure $P < P_c$, the liquid and gaseous phases are clearly separated by the transition line $P_0(T)$, so if we
achieve the transition just by changing temperature, and hence volume (shown with the red line in Fig.
3), we will pass through the phase coexistence stage. However, if we perform the same final transition
by changing both the pressure and temperature, going around above the critical point (the blue line in
Fig. 3), no definite point of transition may be observed: the substance stays in a single phase, and it is a
subjective judgment of the observer in which region that phase should be called the liquid, and in which
region the gas. For water, the critical point corresponds to 647 K (374°C) and $P_c \approx 22.1$ MPa (i.e. ~200
bars), so that a lecture demonstration of its critical behavior would require substantial safety
precautions. This is why such demonstrations are typically carried out with other fluids such as
diethyl ether, 10 with much lower $T_c$ (194°C) and $P_c$ (3.6 MPa). Though the ether is colorless and clear in
both gas and liquid phases, their separation (due to gravity) is visible (due to a difference in the optical
refraction coefficient) at $P < P_c$, but not above $P_c$. 11

Thus, in the van der Waals model, two phases may coexist, though only at certain conditions ($P < P_c$).
Now the natural question is whether the coexistence of more than two phases of the same
substance is possible. For example, can the water ice, liquid water, and water vapor (steam) be in
thermodynamic equilibrium? The answer is essentially given by Eq. (6). From thermodynamics, we
know that for a uniform system (with $G = \mu N$), pressure and temperature completely define the chemical
potential. Hence, dealing with two phases, we have to satisfy just one chemical equilibrium condition
(6) for two common parameters $P$ and $T$. Evidently, this leaves us with one extra degree of freedom, so


10 (CH3-CH2)-O-(CH2-CH3), historically the first popular general anesthetic.

11 It is interesting that very close to the critical point the substance suddenly becomes opaque - in the case of
ether, whitish. The qualitative explanation of this effect, called the critical opalescence, is simple: at this point the
difference of Gibbs energies per particle (i.e. chemical potentials) of the two phases becomes so small that the
unavoidable thermal fluctuations lead to spontaneous appearance and disappearance of relatively large
(a-few-μm-scale) single-phase regions in all the volume. Since the optical refraction coefficients of the phases are
slightly different, large concentration of the region boundaries leads to strong light scattering.
that the two-phase equilibrium is possible within a certain range of $P$ at fixed $T$ (or vice versa) - see Fig.
3a. Now, if we want three phases to be in equilibrium, we need to satisfy two equations for these
variables:

$$\mu_1(P, T) = \mu_2(P, T) = \mu_3(P, T). \tag{4.18}$$

Typically, functions $\mu(P, T)$ are monotonic, so that Eqs. (18) have just one solution, the so-called triple
point {$P_t$, $T_t$}. Of course, the triple point {$P_t$, $T_t$} of equilibrium between three phases should not be
confused with the critical points {$P_c$, $T_c$} of transitions between two phase pairs. Fig. 3b shows, very
schematically, their relation for a typical three-phase system solid-liquid-gas. For example, water, ice,
and water vapor are at equilibrium at a triple point corresponding to 0.612 kPa and (by definition,
exactly) 273.16 K. 12 The particular importance of this temperature point is that by an
international agreement it has been accepted for the Celsius scale definition, as 0.01°C, so that the
absolute temperature zero corresponds to exactly -273.15°C. More generally, triple points of pure
substances (such as H2, N2, O2, Ar, Hg, and H2O) are broadly used for thermometer calibration, defining
the so-called international temperature scales including the currently accepted scale ITS-90.

This result may be readily generalized to multi-component systems consisting of particles of 
several (say, L ) sorts. 13 If such a system is in a single phase, i.e. macroscopically uniform, its chemical 
potential may be defined by the natural generalization of Eq. (1.53c): 

$$dG = -S\,dT + V\,dP + \sum_{l=1}^{L} \mu^{(l)}\, dN^{(l)}. \qquad (4.19)$$

Typically, a single phase is not a pure substance, but has certain concentrations of other components, so
that $\mu^{(l)}$ may depend not only on P and T, but also on concentrations $c^{(l)} \equiv N^{(l)}/N$ of particles
of each sort. If the total number N of particles is fixed, the number of independent concentrations is (L - 1). For the
chemical equilibrium of R phases, all R values of $\mu_r^{(l)}$ (r = 1, 2, ..., R) have to be equal for particles of
each sort: $\mu_1^{(l)} = \mu_2^{(l)} = \dots = \mu_R^{(l)}$, with each $\mu_r^{(l)}$ depending on (L - 1)
concentrations $c_r^{(l)}$, and also on P and T. This requirement gives L(R - 1) equations for (L - 1)R
concentrations $c_r^{(l)}$, plus two common arguments P and T, i.e. for [(L - 1)R + 2] independent variables.
This means that the number of phases has to satisfy the limitation

$$L(R - 1) \le (L - 1)R + 2, \quad \text{i.e. } R \le L + 2, \qquad (4.20)$$

where the equality sign may be reached in just one point in the whole parameter space. This is the Gibbs
phase rule. As a sanity check, for a single-component system, L = 1, the rule yields R ≤ 3 - exactly the
result we have already discussed.
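The counting behind the Gibbs phase rule may be illustrated with a short Python sketch. It is only a bookkeeping aid under the assumptions stated above (L sorts of particles, R phases, fixed total N); the function names are hypothetical and are not used elsewhere in these notes.

```python
def max_coexisting_phases(num_components: int) -> int:
    """Largest number of phases R allowed by the Gibbs phase rule, R <= L + 2."""
    if num_components < 1:
        raise ValueError("need at least one component")
    return num_components + 2


def free_parameters(num_components: int, num_phases: int) -> int:
    """Number of unknowns minus number of equations, which equals L - R + 2."""
    L, R = num_components, num_phases
    if R < 1 or R > max_coexisting_phases(L):
        raise ValueError("phase count violates the Gibbs phase rule")
    unknowns = (L - 1) * R + 2   # independent concentrations in each phase, plus P and T
    equations = L * (R - 1)      # equal chemical potentials for each sort of particles
    return unknowns - equations
```

For a single-component system, `free_parameters(1, 2)` gives 1 (a coexistence line on the [P, T] plane), while `free_parameters(1, 3)` gives 0 (an isolated triple point).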


4.2. Continuous phase transitions 

As Fig. 2 shows, if we fix pressure P in a system with a first-order phase transition, and start
changing its temperature, crossing the transition point, defined by equation $P_0(T) = P$, requires the


12 Please note that $P_t$ for water is several orders of magnitude lower than $P_c$ of the water-vapor transition, so that
Fig. 3b is indeed very much not to scale!

13 Perhaps the most practically important example is the air/water system. For its detailed discussion, based on Eq. 
(19), the reader may be referred, e.g., to Sec. 3.9 in F. Schwabl, Statistical Mechanics, Springer (2000). Other 
important applications include metallic alloys - solid solutions of metal elements. 
insertion (or extraction) of the finite latent heat $\Lambda$. Relations (14) and (17) show that the latent heat is
directly related to the finite differences between the entropies and volumes of the two phases (at the same
pressure). As we know from Chapter 1, both S and V may be presented as first derivatives of appropriate
thermodynamic potentials. This is why such transitions, involving jumps of the potentials' first derivatives,
are called first-order phase transitions.

On the other hand, there are phase transitions that have zero latent heat ($\Lambda = 0$) and no first
derivative jumps at the transition temperature $T_c$, so that the temperature point is clearly marked, for
example, by a jump of a second derivative of a thermodynamic potential - for example, the derivative
$\partial C/\partial T$ which, according to Eq. (1.24), equals $\partial^2 E/\partial T^2$. In the initial
classification by P. Ehrenfest, this was an example of a second-order phase transition. However, most features
of such phase transitions are also pertinent to some systems in which the second derivatives of potentials are
continuous as well. For this reason, I will use a more recent terminology (suggested by M. Fisher), in which
all phase transitions with $\Lambda = 0$ are called continuous.

Most continuous phase transitions result from particle interactions. Here are some examples: 

(i) At temperatures above ~120°C, the crystal lattice of barium titanate (BaTiO3) is cubic, with a
Ba ion in the center of each Ti-cornered cube (or vice versa) - see Fig. 4a. However, as temperature is
being lowered below that critical value, the sublattice of Ba ions starts moving along one of 6 sides of
the TiO3 sublattice, leading to a small deformation of both lattices - which become tetragonal. This is a
typical example of a structural transition, in this particular case combined with a ferroelectric
transition, because (due to the positive electric charge of Ba ions) below the critical temperature the
BaTiO3 crystal acquires a spontaneous electric polarization.


Fig. 4.4. Cubic lattices of (a) BaTiO3 and (b) CuZn.


(ii) A different kind of phase transition happens, for example, in $\mathrm{Cu}_x\mathrm{Zn}_{1-x}$ alloys (brasses). Their
crystal lattice is always cubic, but above a certain critical temperature $T_c$ (which depends on x) any of its
nodes is occupied by either a copper or a zinc atom, at random. At $T < T_c$, a trend towards atom
alternation arises, and at low temperatures, the atoms are fully ordered, as shown in Fig. 4b for the
stoichiometric case x = 0.5. This is a good example of an order-disorder transition.

(iii) At ferromagnetic transitions (happening, e.g., in Fe at 1,043 K) and antiferromagnetic
transitions (e.g., in MnO at 116 K), lowering of temperature below the critical value 14 does not change
atom positions substantially, but results in a partial ordering of atomic spins, eventually leading to their
full ordering (Fig. 5).


14 For ferromagnets, this point is usually referred to as the Curie temperature, and for antiferromagnets, as the
Néel temperature.


(iv) Finally, the Bose-Einstein condensation of atoms in liquid helium and electrons in
superconducting metals and metal oxides may also be considered as continuous phase transitions. At
first glance, this contradicts the nonvanishing latent heat given by the BEC theory outlined in Sec.
3.4. However, that theory shows that $\Lambda \to 0$ at $T \to 0$ and hence $P(T) \to 0$ - see Eq. (3.79). Hence, at
zero pressure the Bose-Einstein condensation of an ideal gas may be considered a continuous
phase transition. For a gas, this is just not a very interesting limit, because of the vanishing gas density.
On the contrary, the Bose-Einstein condensation of strongly interacting particles in liquids or solids is
not affected by pressure - at least on the ambient pressure scale, and taking P = 0 is quite a legitimate
assumption. 15




Fig. 4.5. Classical images of 
completely ordered phases: 
(a) a ferromagnet, and (b) 
an antiferromagnet. 


Besides these standard examples, some other threshold phenomena, such as formation of a 
coherent optical field in a laser, and even the self-excitation of oscillators with negative damping (see, 
e.g., CM Sec. 4.4), may be treated, at certain conditions, as continuous phase transitions. 16 


The general feature of all these transitions is the gradual formation, at $T < T_c$, of certain ordering,
which may be characterized by some order parameter $\eta \neq 0$. The simplest example of such an order
parameter is the magnetization at the ferromagnetic transitions, and this is why the continuous phase
transitions are usually discussed on certain models of ferromagnetism. (I will follow this tradition, while
mentioning in passing other important cases that require a substantial modification of theory.) Most of
such models are defined on an infinite 3D cubic lattice (see, e.g., Fig. 5), with evident generalizations to
lower dimensions. For example, the Heisenberg model of a ferromagnet is defined by the following
Hamiltonian:


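$$\hat{H} = -J \sum_{\{j,j'\}} \hat{\boldsymbol{\sigma}}_j \cdot \hat{\boldsymbol{\sigma}}_{j'} - \sum_j \mathbf{h} \cdot \hat{\boldsymbol{\sigma}}_j, \quad \text{with } \mathbf{h} \equiv \mu_B \boldsymbol{\mathscr{B}}, \qquad (4.21)$$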


15 As follows from the discussion of Eqs. (1.1)-(1.3), for ferroelectric transitions between phases with different
electric polarization, the role of pressure is played by the external electric field $\mathscr{E}$, while for the ferromagnetic
transitions between phases with different magnetization, by the external magnetic field $\mathscr{H}$. As we will see very
soon, such fields give such a phase transition a nonvanishing latent heat, making it a first-order transition.

16 Unfortunately, I will have no time for these interesting (and practically important) generalizations, and have to
refer the interested reader to the famous monograph by R. Stratonovich, Topics in the Theory of Random Noise, in
2 vols., Gordon and Breach, 1963 and 1967, and/or the influential review by H. Haken, Festkörperprobleme 10,
351 (1970).
where $\hat{\boldsymbol{\sigma}}_j$ is the Pauli matrix operator 17 acting on the j-th spin, $\mathbf{n}_{\mathscr{B}}$ is the
direction of the magnetic field, and the constant $\mu_B$ is the Bohr magneton

$$\mu_B \equiv \frac{e\hbar}{2m_e} \approx 0.927 \times 10^{-23}\ \mathrm{J/T}, \qquad (4.22)$$


with (-e) and $m_e$ being the electron's charge and mass. The figure brackets {j, j'} in Eq. (21) denote the
summation over the pairs of adjacent sites, so that the magnitude of constant J may be interpreted as the
maximum coupling energy per "bond" between two adjacent particles. At J > 0, the coupling tries to
keep spins aligned (thus minimizing the coupling energy), i.e. to implement the ferromagnetic
ordering. 18 The second term in Eq. (21) describes the effect of the external magnetic field, which tries to
turn all spins, with their magnetic moments, along its direction.
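As a quick numerical check of Eq. (22), the following Python sketch evaluates the Bohr magneton and the field energy scale $\mu_B \mathscr{B}$ expressed in kelvins. The SI constants are the CODATA 2018 values, which are an assumption of this sketch (they are not quoted in the text), and the function names are hypothetical.

```python
# CODATA 2018 values in SI units (assumed here; not quoted in the text).
E_CHARGE = 1.602176634e-19      # C, elementary charge e
HBAR = 1.054571817e-34          # J*s, reduced Planck constant
M_ELECTRON = 9.1093837015e-31   # kg, electron rest mass
K_BOLTZMANN = 1.380649e-23      # J/K, Boltzmann constant


def bohr_magneton() -> float:
    """Return mu_B = e * hbar / (2 * m_e) in J/T, as in Eq. (22)."""
    return E_CHARGE * HBAR / (2.0 * M_ELECTRON)


def field_energy_in_kelvin(field_tesla: float) -> float:
    """Return mu_B * B divided by the Boltzmann constant, i.e. in kelvins."""
    return bohr_magneton() * field_tesla / K_BOLTZMANN
```

With these constants, a field of 1 T corresponds to an energy scale below 1 K, which gives a feeling of how weak the field term is in comparison with typical coupling energies J in ferromagnets with Curie temperatures of hundreds of kelvins.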


However, even the Heisenberg model, while being approximate, is still too complex for analysis. 
This is why most theoretical results have been obtained for its classical twin, the Ising model : 19 


$$E_m = -J \sum_{\{j,j'\}} s_j s_{j'} - h \sum_j s_j. \qquad (4.23)$$


Here $E_m$ are eigenvalues of energy in the magnetic field, constant h mimics an external magnetic
field, and $s_j$ are classical scalar variables that may take only two values, $s_j = \pm 1$. (Despite its classical
character, variable $s_j$, modeling the real spin of an electron, is usually called "spin" for brevity, and I will
follow this tradition.) Index m numbers all possible combinations of variables $s_j$ - there are $2^N$ of them in
a system of N Ising "spins". Somewhat shockingly, even for this toy model, no analytical 3D solutions
have been found, and the solution of its 2D version by L. Onsager in 1944 (see Sec. 5 below) is still
considered one of the top intellectual achievements of statistical physics. Still, Eq. (23) is very useful
for the introduction of basic notions of continuous phase transitions, and methods of their analysis, and I
will focus my brief discussion on this model. 20


Evidently, if T = 0 and h = 0, the lowest value of internal energy, 


$$E_{\min} = -JNd, \qquad (4.24)$$


where d is the lattice dimensionality, is achieved in the "ferromagnetic" phase in which all spins $s_j$ are
equal to either +1 or -1 simultaneously, so that the lattice average $\langle s_j \rangle = \pm 1$. On the other hand, at J = 0
and h = 0, the spins are independent, and in the absence of external field their signs are completely
random, with the 50% probability to have either of values ±1, so that $\langle s_j \rangle = 0$. Hence in the case of
arbitrary parameters we may use the average


17 See, e.g., QM Sec. 4.4. 

18 At J < 0, the first term of Eq. (21) gives a reasonable model of an antiferromagnet, but in this case the external 
magnetic field effects are more subtle, so I will not have time to discuss it. 

19 Named after E. Ising who explored the 1D version of the model in detail in 1925, though a similar model was
discussed earlier (in 1920) by W. Lenz.

20 For a more detailed discussion of phase transition theory (including other popular models of the ferromagnetic 
phase transition, e.g., the Potts model), see, e.g., either H. Stanley, Introduction to Phase Transitions and Critical 
Phenomena, Oxford U. Press, 1971; or A. Patashinskii and V. Pokrovskii, Fluctuation Theory of Phase 
Transitions, Pergamon, 1979; or B. McCoy, Advanced Statistical Mechanics, Oxford U. Press, 2010. For a much 
more concise text, I can recommend J. Yeomans, Statistical Mechanics of Phase Transitions, Clarendon, 1992. 



$$\eta \equiv \langle s_j \rangle \qquad (4.25)$$


as a good measure of spin ordering, i.e. as the order parameter. Since in a real ferromagnet, each spin
carries a magnetic moment, the order parameter $\eta$ corresponds to the substance magnetization, at $\eta h > 0$
directed along the applied magnetic field. 21
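For a very small lattice, the statements about the energy minimum (24) and the order parameter (25) may be checked by a brute-force enumeration of all $2^N$ spin configurations. The Python sketch below does this under explicit assumptions that are not part of the text: a periodic hypercubic lattice with at least 3 sites per side (so that each pair of adjacent sites gives exactly one bond, and the number of bonds is Nd), and hypothetical function names.

```python
from itertools import product


def lattice_bonds(side, dim):
    """List each nearest-neighbor pair of a periodic hypercubic lattice once (side >= 3)."""
    sites = list(product(range(side), repeat=dim))
    index = {site: k for k, site in enumerate(sites)}
    bonds = []
    for site in sites:
        for axis in range(dim):
            neighbor = list(site)
            neighbor[axis] = (neighbor[axis] + 1) % side   # periodic boundary
            bonds.append((index[site], index[tuple(neighbor)]))
    return len(sites), bonds


def ising_energy(spins, bonds, J=1.0, h=0.0):
    """E_m = -J * sum over bonds of s_j * s_j' - h * sum over sites of s_j, as in Eq. (23)."""
    coupling = sum(spins[j] * spins[k] for j, k in bonds)
    return -J * coupling - h * sum(spins)


def ground_states(side, dim, J=1.0, h=0.0):
    """Brute-force search over all 2**N configurations; practical only for small N."""
    n_sites, bonds = lattice_bonds(side, dim)
    best_energy, best = None, []
    for spins in product((-1, 1), repeat=n_sites):
        energy = ising_energy(spins, bonds, J, h)
        if best_energy is None or energy < best_energy - 1e-12:
            best_energy, best = energy, [spins]
        elif abs(energy - best_energy) <= 1e-12:
            best.append(spins)
    return n_sites, best_energy, best
```

For a 3x3 periodic lattice with J = 1 and h = 0, the search returns the energy -JNd = -18 and two degenerate ground states, with the lattice averages of $s_j$ equal to +1 and -1; any h > 0 lifts this degeneracy in favor of the state with all $s_j$ = +1.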


Due to the difficulty of calculating the order parameter for arbitrary temperatures, most
theoretical discussions of continuous phase transitions are focused on its temperature dependence just
below $T_c$. Both experiment and theory show that (in the absence of external field) this dependence is
close to a certain power,


$$\eta \propto \tau^{\beta}, \quad \text{for } \tau > 0, \qquad (4.26)$$


of the small deviation from the critical temperature, which is conveniently normalized as 


$$\tau \equiv \frac{T_c - T}{T_c}. \qquad (4.27)$$


Remarkably, most other key variables follow a similar temperature behavior, with the same critical
exponent for both signs of $\tau$. In particular, the heat capacity at fixed magnetic field behaves as 22


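$$C_h \propto \frac{1}{|\tau|^{\alpha}}. \qquad (4.28)$$

Similarly, the (normalized) low-field susceptibility 23

$$\chi \equiv \left.\frac{\partial \eta}{\partial h}\right|_{h=0} \propto \frac{1}{|\tau|^{\gamma}}. \qquad (4.29)$$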


Two more important critical exponents, $\zeta$ and $\nu$, describe the temperature behavior of the
correlation function $\langle s_j s_{j'} \rangle$ whose dependence on distance $r_{jj'}$ between two spins may be well fitted
by the following law,


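$$\langle s_j s_{j'} \rangle \propto \frac{1}{r_{jj'}^{\,d-2+\zeta}} \exp\left\{-\frac{r_{jj'}}{r_c}\right\}, \qquad (4.30)$$

with the correlation radius

$$r_c \propto \frac{1}{|\tau|^{\nu}}. \qquad (4.31)$$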


Finally, three more critical exponents, usually denoted $\varepsilon$, $\delta$, and $\mu$, describe the external field
dependences of, respectively, $C_h$, $\eta$, and $r_c$ at $\tau = 0$. For example, $\delta$ is defined as


21 See, e.g., EM Secs. 5.4-5.5.

22 The form of all temperature functions is selected so that all critical exponents are non-negative. 

23 This variable models the real physical magnetic susceptibility $\chi_m$ of magnetic materials - see, e.g., EM Eq.
(5.111).
$$\eta \propto h^{1/\delta}. \qquad (4.32)$$

(Other field exponents are used less frequently, and for their discussion I have to refer the interested 
reader to the special literature listed above.) 

The second column of Table 1 shows experimental values of the critical exponents for various
3D physical systems featuring continuous phase transitions. One can see that their values vary from
system to system, leaving no hope for a universal theory that would describe them all. However, certain
combinations of the exponents are much more reproducible - see the bottom lines of the table.


Table 4.1. Major critical exponents of continuous phase transitions

Exponents and                   Experimental     Mean-field  2D Ising  3D Ising  3D Heisenberg
combinations                    range (3D) (a)   theory      model     model     model (d)
------------------------------  ---------------  ----------  --------  --------  -------------
$\alpha$                        0 - 0.14         0 (b)       (c)       0.12      -0.14
$\beta$                         0.32 - 0.39      1/2         1/8       0.31      0.3
$\gamma$                        1.3 - 1.4        1           7/4       1.25      1.4
$\delta$                        4 - 5            3           15        5         -
$\nu$                           0.6 - 0.7        1/2         1         0.64      0.7
$\zeta$                         0.05             0           1/4       0.05      0.04
$(\alpha + 2\beta + \gamma)/2$  1.00 ± 0.005     1           1         1         1
$\delta - \gamma/\beta$         0.93 ± 0.08      1           1         1         ?
$(2 - \zeta)\nu/\gamma$         1.02 ± 0.05      1           1         1         1
$(2 - \alpha)/\nu d$            (e)              4/d         1         1         1

(a) Experimental data are from the monograph by A. Patashinskii and V. Pokrovskii, cited above.

(b) Discontinuity at $\tau = 0$ - see below.

(c) Instead of following Eq. (28), in this case $C_h$ diverges as $\ln|\tau|$.

(d) With the order parameter $\eta$ defined as $\langle \hat{\boldsymbol{\sigma}}_j \cdot \mathbf{n}_{\mathscr{B}} \rangle$.

(e) I could not find any data on this.


Historically the first (and perhaps the most fundamental) of these universal relations was derived
in 1963 by J. Essam and M. Fisher:

$$\alpha + 2\beta + \gamma = 2. \qquad (4.33)$$

It may be proved, for example, by finding the temperature dependence of the magnetic field value $h_\tau$
that changes the order parameter by an amount similar to that already existing at h = 0, due to a finite
temperature deviation $\tau > 0$. First, we may compare Eqs. (26) and (29), to get

$$h_\tau \propto \tau^{\beta + \gamma}. \qquad (4.34)$$
By the physical sense of $h_\tau$, we may expect that such a field has to affect the system's free energy 24 F(T, h) by
an amount comparable to the effect of a bare temperature change $\tau$. Ensemble-averaging the last term of
Eq. (23) and using the definition (25) of the order parameter $\eta$, we see that the change of F (per particle)
due to the field equals $-h_\tau \eta$ and, according to Eq. (26), scales as $h_\tau \tau^{\beta} \propto \tau^{2\beta + \gamma}$.


In order to estimate the thermal effect on F, let us first derive one more useful general
thermodynamic formula. 25 Dividing Eq. (1.19) by dT, we may present the heat capacity of a system as

$$C_X = T \left(\frac{\partial S}{\partial T}\right)_X, \qquad (4.35)$$


where X is the variable maintained constant at the temperature variation. For example, in the standard 
“P-V” thermodynamics, we may use the first of Eqs. (1.35) to recast Eq. (35) for X = V as 


$$C_V = T \left(\frac{\partial S}{\partial T}\right)_V = -T \left(\frac{\partial^2 F}{\partial T^2}\right)_V, \qquad (4.36)$$


while for X = P it may be combined with Eq. (1.39) to get 


$$C_P = T \left(\frac{\partial S}{\partial T}\right)_P = -T \left(\frac{\partial^2 G}{\partial T^2}\right)_P. \qquad (4.37)$$


As was just discussed, in the Ising model the role of pressure P is played by the external
magnetic field h, and of G by F, so that the last form of Eq. (37) means that the thermal part of F may be
found by double integration of $(-C_h/T)$ over temperature. In the context of our current discussion, this
means that near $T_c$, the free energy scales as the double integral of $C_h \propto \tau^{-\alpha}$ over $\tau$. In the limit
$\tau \ll 1$, factor T may be treated as a constant; as a result, the change of F due to $\tau > 0$ alone scales as
$\tau^{2 - \alpha}$. Requiring this change to be proportional to the same power of $\tau$ as the field-induced part of
energy, i.e. $2 - \alpha = 2\beta + \gamma$, we get the Essam-Fisher relation (33).

Using similar reasoning, it is straightforward to derive a few other universal relations of critical 
exponents, including the Widom relation. 


24 There is some duality of terminology (and notation) in literature on this topic. Indeed, in the Ising model (as in
the Heisenberg model), the magnetic field effects are usually accounted for at the microscopic level, by the inclusion
of the corresponding term into each particular value of energy $E_m$. Then, as was discussed in Sec. 1.4, the system's
equilibrium (at fixed external field h, and also T and N) corresponds to the minimum of the Helmholtz free energy
F. From this point of view, these problems do not feature either pressure or volume, hence we may take PV =
const, so that both thermodynamic potentials effectively coincide: G = F + PV = F + const. On the other hand, it
is fair to say that the role of the magnetic field in these problems is very similar to that of pressure (or rather of
-P) in the "usual" thermodynamics. Due to this analogy, and taking into account that the equilibrium of a system at
fixed P corresponds to the minimum of the Gibbs free energy G, in some publications this name is used for the
minimized potential. Still, on the microscopic level, there is a difference in the descriptions of field and pressure -
see the footnote in the end of Sec. 2.4. For this reason, I will follow the traditional, first point of view in most
of my narrative, but will use the replacements F → G and h → -P to use thermodynamic formulas (1.39) and (37)
when convenient.

25 Admittedly, it belongs to Chapter 1, but I was reluctant to derive it there to avoid a narrative interruption.
$$\delta - \frac{\gamma}{\beta} = 1, \qquad (4.38)$$

very similar relations for other high-field exponents $\varepsilon$ and $\mu$ (which I do not have time to discuss), and
the Fisher relation

$$\nu(2 - \zeta) = \gamma. \qquad (4.39)$$

A slightly more complex reasoning, involving the so-called scaling hypothesis, yields the
dimensionality-dependent Josephson relation

$$\nu d = 2 - \alpha. \qquad (4.40)$$

Table 1 shows that at least three of these relations are in a very reasonable agreement with 
experiment, so that we will use them as a testbed for various theoretical approaches to continuous phase 
transitions. 
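The universal relations may be checked against the exactly known columns of Table 1 with a few lines of Python. The sketch below uses exact rational arithmetic; the dictionary and function names are hypothetical, and for the 2D Ising model the logarithmic divergence of the heat capacity is treated as $\alpha = 0$, which is an assumption of this sketch (though a common convention).

```python
from fractions import Fraction as Fr

# Exponent sets copied from Table 4.1 (mean-field theory and the 2D Ising model).
MEAN_FIELD = dict(alpha=Fr(0), beta=Fr(1, 2), gamma=Fr(1),
                  delta=Fr(3), nu=Fr(1, 2), zeta=Fr(0))
ISING_2D = dict(alpha=Fr(0), beta=Fr(1, 8), gamma=Fr(7, 4),
                delta=Fr(15), nu=Fr(1), zeta=Fr(1, 4))


def check_relations(exps, d):
    """Report which of the four universal relations hold exactly for dimensionality d."""
    return {
        "Essam-Fisher": exps["alpha"] + 2 * exps["beta"] + exps["gamma"] == 2,
        "Widom": exps["delta"] - exps["gamma"] / exps["beta"] == 1,
        "Fisher": exps["nu"] * (2 - exps["zeta"]) == exps["gamma"],
        "Josephson": exps["nu"] * d == 2 - exps["alpha"],
    }
```

Calling `check_relations(ISING_2D, 2)` reports all four relations as satisfied, while `check_relations(MEAN_FIELD, 3)` flags only the Josephson relation, in agreement with the entry 4/d in the last line of the table.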


4.3. Landau's mean-field theory

The most general approach to analysis of the continuous phase transitions, formally not based on
any particular model (though in fact implying the Ising model (23) or one of its siblings), is the
mean-field theory developed in 1937 by L. Landau, on the basis of prior ideas by P.-E. Weiss - to be discussed
in the next section. The main approximation of this phenomenological approach is to present the free
energy change $\Delta F$ at the phase transition as an explicit function of the order parameter $\eta$ (25). Generally
this function may be complicated and model-specific, but near $T_c$, $\eta$ has to tend to zero, so that the
change of the relevant thermodynamic potential, the free energy,

$$\Delta F \equiv F(T) - F(T_c), \qquad (4.41)$$

may be expanded into the Taylor series in $\eta$, and only a few, most important first terms of that
expansion retained. In order to keep the symmetry between two possible signs of the order parameter in
the absence of external field, at h = 0 this expansion should not include odd powers of $\eta$:

$$\left.\frac{\Delta F}{V}\right|_{h=0} = A(T)\eta^2 + \frac{1}{2}B(T)\eta^4 + \dots \qquad (4.42)$$

As we will see imminently, these two terms are sufficient to describe finite (non-vanishing but limited)
stationary values of the order parameter; this is why Landau's theory ignores the higher terms of the
Taylor expansion - which are much smaller at $\eta \to 0$.

Now let us discuss temperature dependences of coefficients A and B. The equilibrium of the
system should correspond to the minimum of F. Equation (42) shows that, first of all, coefficient B(T) has to
be positive for any sign of $\tau$, to ensure the equilibrium at a finite value of $\eta^2$. Thus, it is reasonable to
ignore the temperature dependence of B near the critical temperature altogether and use the approximation

$$B(T) = b > 0. \qquad (4.43)$$

On the other hand, as Fig. 6 shows, coefficient A(T) has to change sign at $T = T_c$, being positive at $T > T_c$
and negative at $T < T_c$, to ensure the transition from $\eta = 0$ at $T > T_c$ to a certain non-vanishing value at
$T < T_c$. Since A should be a smooth function of temperature, we may approximate it by the leading term
in its Taylor expansion in $\tau$:

$$A(T) = -a\tau, \quad \text{with } a > 0, \qquad (4.44)$$

so that Eq. (42) becomes

$$\left.\frac{\Delta F}{V}\right|_{h=0} = -a\tau\eta^2 + \frac{1}{2}b\eta^4. \qquad (4.45)$$


Fig. 4.6. Free energy (42) as a function of (a) $\eta$ and (b) $\eta^2$ in Landau's mean-field theory,
for two different signs of coefficient $A(\tau)$.


The main strength of Landau's theory is the possibility of its straightforward extension to the
effects of the external field and of spatial variations of the order parameter. First, averaging of the field
term of Eq. (23) over all sites of the system, with the account of Eq. (25), gives an energy addition of
$-h\eta$ per particle, i.e. $-nh\eta$ per unit volume, where n is the particle density. Second, since (according to
Eq. (31) with $\nu > 0$, see Table 1) the correlation radius diverges at $\tau \to 0$, spatial variations of the order
parameter should be slow, $|\nabla\eta| \to 0$. Hence, the effects of the gradient on $\Delta F$ may be approximated by
the first nonvanishing term of its expansion into the Taylor series in $(\nabla\eta)$. As a result, Eq. (45) may be
generalized as

$$\Delta F = \int \Delta f\, d^3r, \quad \text{with } \Delta f = -a\tau\eta^2 + \frac{1}{2}b\eta^4 - nh\eta + c(\nabla\eta)^2, \qquad (4.46)$$


where c is a factor independent of $\eta$. In order to avoid the unphysical effect of spontaneous formation of
spatial variations of the order parameter, that factor has to be positive at all temperatures, and hence may
be taken as constant in a small vicinity of $T_c$ - the only region where Eq. (46) may be expected to
provide quantitatively correct results.


Relation (46) is the full version of the free energy in Landau's theory. 26 Now let us find out what
critical exponents are predicted by this phenomenological approach. First of all, we may find
equilibrium values of the order parameter from the condition of F having a minimum, $\partial F/\partial\eta = 0$. At h =
0, it is easier to use the equivalent equation $\partial F/\partial(\eta^2) = 0$, where F is given by Eq. (45) - see Fig. 6b.
This immediately yields

$$|\eta| = \begin{cases} (a\tau/b)^{1/2}, & \text{for } \tau > 0, \\ 0, & \text{for } \tau < 0. \end{cases} \qquad (4.47)$$


26 Historically, the last term belongs to the later (1950) extension of the theory by V. Ginzburg and L. Landau - 
see below. 
Comparing this result with Eq. (26), we see that in the Landau theory, $\beta = 1/2$. Next, plugging result (47)
back into Eq. (45), for the equilibrium (minimal) value of the free energy, we get

$$\left.\frac{\Delta F}{V}\right|_{h=0} = \begin{cases} -a^2\tau^2/2b, & \text{for } \tau > 0, \\ 0, & \text{for } \tau < 0. \end{cases} \qquad (4.48)$$

From here and Eq. (36), the specific heat,

$$\frac{C_h}{V} = \begin{cases} a^2/bT_c, & \text{for } \tau > 0, \\ 0, & \text{for } \tau < 0, \end{cases} \qquad (4.49)$$


has, at the critical point, a discontinuity rather than a singularity, i.e. the critical exponent $\alpha = 0$.
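The zero-field results of Landau's theory are easy to verify numerically. The following Python sketch minimizes the free energy density of Eq. (45) by a simple ternary search and estimates the exponent $\beta$ from a log-log slope. The coefficient values a = b = 1, the search interval, and the function names are arbitrary assumptions made only for this illustration.

```python
import math


def landau_f(eta, tau, a=1.0, b=1.0):
    """Free energy density of Eq. (45): -a*tau*eta**2 + b*eta**4/2."""
    return -a * tau * eta**2 + 0.5 * b * eta**4


def equilibrium_eta(tau, a=1.0, b=1.0, eta_max=10.0, iterations=200):
    """Ternary search for the minimum of landau_f on 0 <= eta <= eta_max."""
    lo, hi = 0.0, eta_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if landau_f(m1, tau, a, b) < landau_f(m2, tau, a, b):
            hi = m2     # the minimum cannot lie to the right of m2
        else:
            lo = m1     # the minimum cannot lie to the left of m1
    return 0.5 * (lo + hi)


def estimate_beta(tau_1=1e-2, tau_2=1e-4):
    """Log-log slope of the equilibrium eta between two small positive values of tau."""
    eta_1, eta_2 = equilibrium_eta(tau_1), equilibrium_eta(tau_2)
    return math.log(eta_1 / eta_2) / math.log(tau_1 / tau_2)
```

The search works because, on the semi-axis of non-negative $\eta$, the function (45) has a single minimum for either sign of $\tau$. The estimated slope is close to 1/2, and the minimal value of the free energy density reproduces the upper line of Eq. (48).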

In the presence of a uniform field, the equilibrium order parameter should be found from the
condition $\partial f/\partial\eta = 0$ applied to Eq. (46) with $\nabla\eta = 0$, giving

$$\frac{\partial \Delta f}{\partial \eta} = -2a\tau\eta + 2b\eta^3 - nh = 0. \qquad (4.50)$$

In the limit of small order parameter, $\eta \to 0$, the term with $\eta^3$ is negligible, and Eq. (50) gives

$$\eta = -\frac{nh}{2a\tau}, \qquad (4.51)$$


so that according to Eq. (29), $\gamma = 1$. On the other hand, at $\tau = 0$ (or at relatively high fields at other
temperatures), the cubic term in Eq. (50) is much larger than the linear one, and this equation yields

$$\eta = \left(\frac{nh}{2b}\right)^{1/3}, \qquad (4.52)$$

so that comparison with Eq. (32) yields $\delta = 3$.
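The field exponents may be checked in the same numerical spirit. The Python sketch below solves the cubic equilibrium condition, written here as $2b\eta^3 - 2a\tau\eta - nh = 0$, by bisection for h > 0, and then estimates $\gamma$ (at $\tau < 0$, i.e. above $T_c$, in a very weak field) and $\delta$ (at $\tau = 0$) from log-log slopes. The values a = b = n = 1, the chosen fields and temperatures, and the function names are assumptions of this illustration only.

```python
import math


def eta_in_field(tau, h, a=1.0, b=1.0, n=1.0):
    """Positive root of 2*b*eta**3 - 2*a*tau*eta - n*h = 0 (equilibrium condition), h > 0."""
    if h <= 0:
        raise ValueError("this sketch assumes h > 0")

    def g(eta):
        return 2.0 * b * eta**3 - 2.0 * a * tau * eta - n * h

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:          # expand the interval until the root is bracketed
        hi *= 2.0
    for _ in range(200):        # plain bisection; g(lo) < 0 <= g(hi) is maintained
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_gamma(h=1e-12, tau_1=-1e-2, tau_2=-1e-3):
    """Minus the log-log slope of the susceptibility eta/h versus |tau|, at tau < 0."""
    chi_1 = eta_in_field(tau_1, h) / h
    chi_2 = eta_in_field(tau_2, h) / h
    return -math.log(chi_1 / chi_2) / math.log(tau_1 / tau_2)


def estimate_delta(h_1=1e-3, h_2=1e-6):
    """Inverse log-log slope of eta versus h exactly at tau = 0."""
    slope = math.log(eta_in_field(0.0, h_1) / eta_in_field(0.0, h_2)) / math.log(h_1 / h_2)
    return 1.0 / slope
```

For h > 0 the cubic has exactly one positive root at any sign of $\tau$, so that the bisection is unambiguous. The two estimates come out close to 1 and 3, respectively.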

Finally, according to Eq. (30), the last term in Eq. (46) scales as $c\eta^2/r_c^2$. (If $r_c \neq \infty$, the effects of
the pre-exponential factor in that equation are negligible.) As a result, the gradient term contribution is
comparable 27 with the two leading terms in $\Delta f$ (which, according to Eq. (47), are of the same order), if

$$r_c \sim \left(\frac{c}{a|\tau|}\right)^{1/2}, \qquad (4.53)$$

so that according to definition (31) of the critical exponent $\nu$, it is equal to 1/2.

The third column in Table 1 summarizes the critical exponents and their combinations in
Landau's theory. It shows that these values are somewhat out of the experimental ranges, and while
some of their universal relations are correct, some are not; for example, the Josephson relation would be
only correct at d = 4 (not the most realistic spatial dimensionality :-). The main reason for this



27 According to Eq. (30), the correlation radius may be interpreted as the distance at which the order
parameter $\eta$ relaxes to its equilibrium value, if it is deflected from it at some point. Since the law of such spatial
change may be obtained by a variational differentiation of F, for the actual relaxation law, all major terms of (46)
have to be comparable.
disappointing result is that, while describing the spin interaction with the field, the Landau mean-field theory
neglects spin randomness, i.e. fluctuations. Though a quantitative theory of thermodynamic fluctuations
will not be discussed until the next chapter, we can readily perform their crude estimate. Looking at Eq.
(46), we see that its first term is a quadratic function of the effective "half-degree of freedom", $\eta$. Hence
in accordance with the equipartition theorem (2.28) we may expect that the energy associated with the average
square of its thermal fluctuations, within a d-dimensional volume with linear size ~$r_c$, should be of the order
of T/2 (close to the critical temperature, $T_c/2$ is a good approximation):

$$a|\tau| \langle \tilde{\eta}^2 \rangle r_c^d \sim \frac{T_c}{2}. \qquad (4.54)$$


In order to be negligible, the variance has to be small in comparison with the average $\eta^2 \sim a\tau/b$.
Plugging in the $\tau$-dependences of the operands of this relation, and values of the critical exponents in
the Landau theory, for $\tau > 0$ we get the so-called Levanyuk-Ginzburg criterion of its validity:

$$\frac{T_c}{2a\tau}\left(\frac{a\tau}{c}\right)^{d/2} \ll \frac{a}{b}\tau. \qquad (4.55)$$


We see that for any realistic dimensionality, d < 4, at $\tau \to 0$ the order parameter fluctuations grow faster
than its average value, and hence the theory becomes invalid.
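The role of dimensionality in this criterion may be seen from a few numbers. The Python sketch below evaluates the ratio of the fluctuation variance, estimated as $(T_c/2a\tau)(a\tau/c)^{d/2}$, to the mean square $a\tau/b$ of the order parameter; this ratio scales as $\tau^{d/2 - 2}$. All coefficients are set to 1 purely for illustration, and the function name is hypothetical.

```python
def fluctuation_ratio(tau, d, a=1.0, b=1.0, c=1.0, T_c=1.0):
    """Fluctuation variance divided by the mean square of eta, for tau > 0."""
    if tau <= 0:
        raise ValueError("the criterion is written for tau > 0")
    variance = (T_c / (2.0 * a * tau)) * (a * tau / c) ** (d / 2.0)
    mean_square = a * tau / b
    return variance / mean_square
```

With these (arbitrary) coefficients, for d = 3 the ratio grows from 5 to 50 as $\tau$ is reduced from 0.01 to 0.0001, for d = 4 it stays at 0.5, and for d = 5 it decreases. The mean-field description is self-consistent only where this ratio is much smaller than 1.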

Thus the Landau mean-field theory is not a perfect approach to finding critical indices at
continuous phase transitions in Ising-type systems with their nearest-neighbor interactions between the
particles. Despite that, this theory is very much valued for the following reason. Any
long-range interactions between particles increase the correlation radius $r_c$, and hence suppress the order
parameter fluctuations. For example, at laser self-excitation, the emerging coherent optical field
couples all photon-emitting particles in the electromagnetic "cavity" (resonator). As another example, in
superconductors the role of the correlation radius is played by the Cooper-pair size $\xi_0$, which is typically
of the order of $10^{-6}$ m, i.e. much larger than the average distance between the pairs (~$10^{-8}$ m). As a
result, the mean-field theory remains valid at all temperatures besides an extremely small temperature
interval near $T_c$ - for bulk superconductors, of the order of $10^{-6}$ K.

Another strength of Landau's classical mean-field theory is that it may be readily generalized for
description of Bose-Einstein condensates, i.e. quantum fluids. Of those generalizations, the most famous
is the Ginzburg-Landau theory of superconductivity developed in 1950, i.e. even before the
"microscopic" explanation of this phenomenon by Bardeen, Cooper and Schrieffer in 1956-57. In
the Ginzburg-Landau theory, the real order parameter $\eta$ is replaced with the modulus of a complex
function $\psi$, physically the wavefunction of the coherent Bose-Einstein condensate of Cooper pairs.
Since each pair carries electric charge q = -2e, 28 and has zero spin, it interacts with magnetic field in a
way different from that described by the Heisenberg or Ising models. Namely, as was already discussed
in Sec. 3.4, the del operator $\nabla$ in Eq. (46) has to be complemented by the term $-i(q/\hbar)\boldsymbol{\mathscr{A}}$, where
$\boldsymbol{\mathscr{A}}$ is the vector potential of the total magnetic field $\boldsymbol{\mathscr{B}} = \nabla \times \boldsymbol{\mathscr{A}}$, including
not only the external magnetic field $\boldsymbol{\mathscr{H}}$,


28 In the phenomenological Ginzburg-Landau theory, charge q remains unspecified, though the wording in their 
original paper clearly shows that the authors correctly anticipated that this charge might turn out to be different 
from the single electron charge. 
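but also the field induced by the supercurrent itself. With the account of the well-known formula for the
magnetic field energy in the external field, 29 Eq. (46) is now replaced with

$$\Delta F = \int \Delta f\, d^3r, \quad \text{with } \Delta f = -a\tau|\psi|^2 + \frac{1}{2}b|\psi|^4 + \frac{\hbar^2}{2m}\left|\left(\nabla - i\frac{q}{\hbar}\boldsymbol{\mathscr{A}}\right)\psi\right|^2 - \boldsymbol{\mathscr{B}} \cdot \boldsymbol{\mathscr{H}} + \frac{\mathscr{B}^2}{2\mu_0}, \qquad (4.56)$$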


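where m is a phenomenological coefficient rather than the actual particle mass. The variational
minimization of the resulting $\Delta F$ over variables $\psi$ and $\boldsymbol{\mathscr{B}}$ (which is suggested for reader's exercise 30)
yields two differential equations:

$$\frac{\nabla \times \boldsymbol{\mathscr{B}}}{\mu_0} = q\left(-\frac{i\hbar}{2m}\right)\left[\psi^*\left(\nabla - i\frac{q}{\hbar}\boldsymbol{\mathscr{A}}\right)\psi - \text{c.c.}\right], \qquad (4.57)$$

$$a\tau\psi = b|\psi|^2\psi - \frac{\hbar^2}{2m}\left(\nabla - i\frac{q}{\hbar}\boldsymbol{\mathscr{A}}\right)^2\psi. \qquad (4.58)$$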



The first of these Ginzburg-Landau equations should be no big surprise for the reader, because
according to the Maxwell equations, in magnetostatics the left-hand part of Eq. (57) has to be equal to
the electric current density, while the right-hand part is the usual quantum-mechanical probability
current density multiplied by q, i.e. the electric current (or rather supercurrent) density $\mathbf{j}_s$ of the Cooper
pair condensate. (Indeed, after plugging $\psi = n^{1/2}\exp\{i\varphi\}$ into that expression, we come back to Eq.
(3.84) which, as we already know, explains such macroscopic quantum phenomena as magnetic flux
quantization and the Meissner-Ochsenfeld effect.)


However, Eq. (58) is new - for this course. Since the last term in its right-hand part is the standard
wave-mechanics expression for the kinetic energy of a particle in the presence of magnetic field, 31 if this
term dominates that part of the equation, Eq. (58) is reduced to the stationary Schrödinger equation,
$E\psi = \hat{H}\psi$, for the ground state of confinement-free Cooper pairs, with energy $E = a\tau$. However, in
contrast to the usual (single-particle) Schrödinger equation, in which $|\psi|$ is determined by the
normalization condition, the Cooper pair condensate density $n = |\psi|^2$ is determined by the
thermodynamic balance of the condensate with the ensemble of "normal" (unpaired) electrons that play
the role of the uncondensed part of Bose gas, discussed in Sec. 3.4. In Eq. (58), such balance is enforced
by the first term $b|\psi|^2\psi$ of the right-hand part. 32 As we have already seen, in the absence of magnetic
field and spatial gradients, such term yields $|\psi| \propto (T_c - T)^{1/2}$ - see Eq. (47).


29 See, e.g., EM Eq. (5.129). 

30 As a useful elementary sanity check, the minimization of $\Delta F$ in the absence of a superconductor, i.e. without the 
first 3 terms in the right-hand part of Eq. (56), immediately gives the correct result $\mathbf{B} = \mu_0\mathbf{H}$. 

31 See, e.g., QM Sec. 3.1. 

32 From the mathematics standpoint, such a term, nonlinear in $|\psi|$, makes Eq. (58) a member of the family of 
“nonlinear Schrödinger equations”. Another important member of this family is the Gross-Pitaevskii equation, 

$$a\tau\psi = b|\psi|^2\psi - \frac{\hbar^2}{2m}\nabla^2\psi + U(\mathbf{r})\psi,$$

which gives a very reasonable (albeit phenomenological and hence approximate) description of Bose-Einstein 
condensates of neutral atoms at $T \ll T_c$. The differences between the Ginzburg-Landau and Gross-Pitaevskii 
equations reflect, first, the zero charge q of the neutral atoms and, second, the fact that the atoms forming the 


It is easy to see that as either the external magnetic field or the current density in a 
superconductor is increased, so is the last term in Eq. (58). This increase has to be matched by a 
corresponding decrease of $|\psi|^2$, i.e. of the condensate density n, until it is completely suppressed. This 
explains the well documented effect of superconductivity suppression by magnetic field and 
supercurrent. Moreover, together with the flux quantization discussed in Sec. 3.4, it explains the 
existence of the so-called Abrikosov vortices - thin tubes of magnetic field, each carrying one quantum 
$\Phi_0$ of magnetic flux - see Eq. (3.86). At the core part of the vortex, $|\psi|^2$ is suppressed (down to zero at 
its central line) by the persistent supercurrent, which circulates around the core and screens the rest of 
the superconductor from the magnetic field carried by the vortex. The penetration of such vortices into the 
so-called type-II superconductors 33 enables them to sustain vanishing electric resistance up to very high 
magnetic fields of the order of 20 T, and to be used in very compact magnets - including those used for 
beam bending in particle accelerators. 

Moreover, generalizing Eq. (58) to the time-dependent case, just as it is done with the usual 
Schrödinger equation ($E \to i\hbar\,\partial/\partial t$), one can describe other fascinating quantum macroscopic phenomena 
such as the Josephson effects, including the generation of oscillations with frequency $\omega_J = (q/\hbar)V$ by 
tunnel junctions between two superconductors, biased by dc voltage V. Unfortunately, time/space 
restrictions do not allow me to discuss these effects in any detail here, and I have to refer the reader to 
special literature. 34 Let me only note that at $T \approx T_c$, and in not extremely pure superconductors (in which 
the so-called non-local transport phenomena may be important), the Ginzburg-Landau equations are 
exact, and may be derived (and their parameters $T_c$, a, b, q, and m determined) from the “microscopic” 
theory of superconductivity based on the initial work by Bardeen, Cooper and Schrieffer. 35 Most 
importantly, such derivation proves that q = -2e - the electric charge of a single Cooper pair. 


4.4. Ising model: The Weiss molecular-field theory 

The Landau mean-field theory is phenomenological in the sense that even within the range of its 
validity, it tells us nothing about the value of the critical temperature $T_c$ and other parameters (in Eq. 
(46), a, b, and c), so that they have to be found from a particular “microscopic” model of the system 
under analysis. In this course, we will have time to discuss only the Ising model (23) for various 
dimensionalities d. 

The most simplistic way to map the model on a mean-field theory is to assume that all spins are 
exactly equal, $s_j = \eta$, with an additional condition $|\eta| \le 1$, forgetting for a minute that in the genuine 
Ising model, $s_j$ may equal only +1 or -1. Plugging this relation into Eq. (23), we get 36 


condensates may be readily placed in external potentials $U(\mathbf{r}) \ne \text{const}$ (e.g., those trapping the atoms), while in 
superconductors such potential profiles are much harder to create due to the screening of electric field by metals - 
see, e.g., EM Sec. 2.1. 

33 Such penetration had been discovered experimentally by L. Shubnikov in the mid-1930s, but its quantitative 
explanation had to wait until A. Abrikosov’s work (based on the Ginzburg-Landau equations) published in 1957. 

34 See, e.g., M. Tinkham, Introduction to Superconductivity, 2nd ed., McGraw-Hill, 1996. A short discussion of 
the Josephson effects may be found in QM Sec. 2.3 and EM Sec. 6.4. 

35 See, e.g., Sec. 45 in E. Lifshitz and L. Pitaevskii, Statistical Physics, Part 2, Pergamon, 1980. 

36 Since in this naive approach we neglect the thermal fluctuations of spin, i.e. their disorder, this assumption 
implies S = 0, so that F = E - TS = E, and we may use either notation for system’s energy. 

$$F = -(NJd)\eta^2 - Nh\eta. \tag{4.59}$$

This energy is plotted in Fig. 7a as a function of $\eta$, for several values of h. The plots show that at 
h = 0, the system may be in either of two stable states, with $\eta = \pm 1$, corresponding to two different 
directions of spins (magnetization), with equal energy. 37 (Formally, the state with $\eta = 0$ is also 
stationary, because at this point $\partial F/\partial\eta = 0$, but it is unstable, because for the ferromagnetic interaction, 
J > 0, the second derivative $\partial^2 F/\partial\eta^2 = -2NJd$ is negative.) 

Fig. 4.7. Field dependence of (a) the free energy profile and (b) order parameter (i.e. magnetization) in the 
crudest mean-field approach to the Ising model. 


As the external field is increased, it tilts the potential profile, and finally at a critical field, 

$$h_c = 2Jd, \tag{4.60}$$

the state with $\eta = -1$ becomes unstable, leading to system’s jump into the only remaining state with 
opposite magnetization, $\eta = +1$. Application of a similar external field of the opposite polarity leads to 
a similar switching back to $\eta = -1$, so that the full field dependence of $\eta$ follows the hysteretic pattern 
shown in Fig. 7b. Such a pattern is the most visible experimental feature of actual ferromagnetic 
materials, with the coercive magnetic field $H_c$ (modeled with $h_c$) of the order of $10^3$ A/m, and the 
saturated magnetization (modeled with $\eta = \pm 1$) corresponding to much higher fields B - of the order of 
a few tesla. The most important property of these materials, also called permanent magnets, is their 
stability, i.e. the ability to retain the history-determined direction of magnetization in the absence of 
external field, for a very long time. In particular, this property is the basis of all magnetic systems for 
data recording, including the ubiquitous hard disk drives with their incredible information density - 
currently approaching 1 Terabit per square inch. 38 

So, this simplest mean-field theory gives a crude description of the ferromagnetic ordering, but 
grossly overestimates the stability of these states with respect to thermal fluctuations. Indeed, in this 


37 The fact that stable states always correspond to $\eta = \pm 1$ partly justifies the treatment of $\eta$ as a continuous 
variable in this crude approximation. 

38 For me, it was always surprising how little physics students knew about this fascinating field of modern 
engineering, which involves so much interesting physics and fantastic electromechanical technology. For getting 
acquainted with it, I may recommend, for example, the monograph by C. Mee and E. Daniel, Magnetic Recording 
Technology, 2nd ed., McGraw-Hill, 1996. 

theory, there is no thermally-induced randomness at all, until T becomes comparable with the height of 
the energy barrier separating two stable states, 

$$\Delta F \equiv F(\eta = 0) - F(\eta = \pm 1) = NJd, \tag{4.61}$$

which is proportional to the number of particles. At $N \to \infty$, this value diverges, and in this sense the 
critical temperature is infinite, while numerical experiments and more refined theories of the Ising 
model show that actually the ferromagnetic phase is suppressed at $T_c \sim Jd$ - see below. 39 

The mean-field approach may be dramatically improved by even an approximate account for 
thermally-induced randomness. In this approach, suggested in 1907 by P.-E. Weiss under the name of 
molecular-field theory, 40 random deviations of individual spin values from the lattice average, 

$$\tilde{s}_j \equiv s_j - \eta, \qquad \eta \equiv \langle s_j \rangle, \tag{4.62}$$

are allowed, but considered small, $|\tilde{s}_j| \ll \eta$. This assumption allows us, after plugging expression 
$s_j = \eta + \tilde{s}_j$ into the first term of the right-hand part of Eq. (23), 

$$E_m = -J\sum_{\{j,j'\}} (\eta + \tilde{s}_j)(\eta + \tilde{s}_{j'}) - h\sum_j s_j, \tag{4.63}$$

to ignore the term proportional to $\tilde{s}_j\tilde{s}_{j'}$. Making replacement (62) in the terms proportional to $\tilde{s}_j$, we get 

$$E_m \to E_m' \equiv (NJd)\eta^2 - h_{\mathrm{ef}}\sum_j s_j, \tag{4.64}$$

where $h_{\mathrm{ef}}$ is defined as the sum 

$$h_{\mathrm{ef}} \equiv h + (2Jd)\eta. \tag{4.65}$$


The physical interpretation of $h_{\mathrm{ef}}$ is the effective external field, which (besides the real external 
field h) takes into account the effect that would be exerted on spin $s_j$ by its 2d nearest neighbors, if they all 
had unperturbed (but possibly fractional) spins $s_{j'} = \eta$. Such an addition to the external field, 

$$h_{\mathrm{mol}} \equiv h_{\mathrm{ef}} - h = (2Jd)\eta, \tag{4.66}$$

is called the molecular field - giving its name to the theory. 


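From the point of view of statistical physics, at fixed parameters of the system (including the 
order parameter $\eta$), the first term in the right-hand part of Eq. (64) is merely a constant energy offset, 
and $h_{\mathrm{ef}}$ is just another constant, so that 

$$E_m' = \text{const} + \sum_j \varepsilon_j, \qquad \varepsilon_j = -h_{\mathrm{ef}}s_j = \begin{cases} -h_{\mathrm{ef}}, & \text{for } s_j = +1, \\ +h_{\mathrm{ef}}, & \text{for } s_j = -1. \end{cases} \tag{4.67}$$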


39 Actually, the thermal stability of many real ferromagnets, with longer-range interaction between spins, is higher 
than that predicted by the Ising model. 

40 In some texts, this approximation is called the mean-field theory. This terminology may lead to confusion, 
because the molecular-field theory is on a completely different level of phenomenology than, say, Landau’s 
mean-field theory. For example, the molecular-field approach may be used for the calculation of parameters a, b, and 
$T_c$ participating in Eq. (46) - the starting point of Landau’s theory. 

Such separability of energies means that in the Weiss approximation the spin fluctuations are 
independent, and their statistics may be examined individually, using the energy spectrum $\varepsilon_j$. But this is 
exactly the two-level system which was the subject of three exercise problems in Chapter 2. Actually, its 
statistics is so simple that it is easier to redo this fundamental problem starting from scratch, rather than 
to use the results of those exercises (which would require changing notation). Indeed, according to the 
Gibbs distribution (2.58)-(2.59), the equilibrium probabilities of states $s_j = \pm 1$ may be found as 

$$W_\pm = \frac{1}{Z}e^{\pm h_{\mathrm{ef}}/T}, \qquad Z = \exp\left\{+\frac{h_{\mathrm{ef}}}{T}\right\} + \exp\left\{-\frac{h_{\mathrm{ef}}}{T}\right\} \equiv 2\cosh\frac{h_{\mathrm{ef}}}{T}. \tag{4.68}$$

From here, we may readily calculate $F = -T\ln Z$ and other thermodynamic variables, but let us 
immediately use Eq. (68) to calculate the statistical average of $s_j$, i.e. the order parameter: 

$$\eta \equiv \langle s_j \rangle = (+1)W_+ + (-1)W_- = \frac{e^{+h_{\mathrm{ef}}/T} - e^{-h_{\mathrm{ef}}/T}}{2\cosh(h_{\mathrm{ef}}/T)} = \tanh\frac{h_{\mathrm{ef}}}{T}. \tag{4.69}$$


Now comes the main trick of the Weiss’ approach: plugging this result back into Eq. (65), we 
may write the condition of self-consistency of the molecular field theory: 

$$h_{\mathrm{ef}} - h = 2Jd\tanh\frac{h_{\mathrm{ef}}}{T}. \tag{4.70}$$

This is a transcendental equation that evades an explicit analytical solution, but its properties may be 
readily understood by plotting both its parts as functions of their argument, so that the stationary state(s) 
of the system corresponds to the intersection point(s) of these plots. 

First of all, let us explore the field-free case (h = 0), when $h_{\mathrm{ef}} = h_{\mathrm{mol}} \equiv 2Jd\eta$, so that Eq. (70) is 
reduced to 

$$\eta = \tanh\left(\frac{2Jd}{T}\eta\right), \tag{4.71}$$

giving one of the patterns sketched in Fig. 8, depending on the dimensionless parameter 2Jd/T. 



Fig. 4.8. Ferromagnetic phase transition in Weiss’ molecular-field theory: two sides of Eq. (71) plotted as 
functions of $\eta$ for 3 temperatures: above $T_c$ (red), below $T_c$ (blue) and equal to $T_c$ (green). 


If this parameter is small, the right-hand part of Eq. (71) grows slowly with $\eta$ (red line in Fig. 8), 
and there is only one intersection point with the left-hand part plot, at $\eta = 0$. This means that the spin 
system features no spontaneous magnetization - the so-called paramagnetic phase. However, if 
parameter 2Jd/T exceeds 1, i.e. T is decreased below the following critical value, 

$$T_c = 2Jd, \tag{4.72}$$

the right-hand part grows, at small $\eta$, faster than the left-hand part, so that their plots intersect at 3 
points: $\eta = 0$ and $\eta = \pm\eta_0$. It is almost evident that the former stationary point is unstable while the two 
latter points are stable. 41 Thus, below $T_c$ the system is in the ferromagnetic phase, with one of two 
possible directions of spontaneous magnetization, so that the critical (Curie) temperature, given by Eq. 
(72), marks the transition between the paramagnetic and ferromagnetic phases. (Since the stable 
minimum value of energy G is a continuous function of temperature at $T = T_c$, this is the continuous 
phase transition.) 
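As an illustration of the graphical argument, Eq. (71) may also be solved numerically. The following minimal Python sketch finds the non-negative solution $\eta_0$ by bisection; the function name, the default values J = 1 and d = 3, and the bracketing interval are assumptions made only for this example.

```python
import math

def spontaneous_order_parameter(T, J=1.0, d=3):
    """Non-negative solution eta_0 of eta = tanh(2*J*d*eta/T), Eq. (4.71).

    Hypothetical helper for illustration; J, d and the bracket are assumed values.
    """
    Tc = 2.0 * J * d                       # Weiss critical temperature, Eq. (4.72)
    if T >= Tc:
        return 0.0                         # only the paramagnetic solution eta = 0 exists

    def g(eta):
        # difference between the right-hand and left-hand parts of Eq. (4.71)
        return math.tanh(Tc * eta / T) - eta

    lo, hi = 1e-9, 1.0                     # g(lo) > 0 below Tc, while g(1) < 0 always
    for _ in range(200):                   # bisection down to floating-point resolution
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for ratio in (1.2, 1.0, 0.99, 0.9, 0.5, 0.1):
        T = ratio * 6.0                    # Tc = 6 for J = 1, d = 3
        print(f"T/Tc = {ratio:4.2f}  eta_0 = {spontaneous_order_parameter(T):.6f}")
```

Bisection is used here (rather than a simple iteration of Eq. (71)) because it stays reliable close to $T_c$, where the two plots in Fig. 8 intersect at a very small angle.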

Now let us repeat the same graphics to examine how each of these phases responds to an external 
magnetic field $h \ne 0$. According to Eq. (70), the effect of h is just a shift of the straight line plot of its 
left-hand part - see Fig. 9. 



In the paramagnetic case (Fig. 9a) the resulting dependence $h_{\mathrm{ef}}(h)$ is evidently continuous, but 
the coupling effect (J > 0) makes it steeper than it would be without spin interaction. This effect 
may be characterized by the low-field susceptibility defined by Eq. (29). To calculate it, let us notice 
that for small h, and hence small $h_{\mathrm{ef}}$, function tanh in Eq. (70) is approximately equal to its argument, so that Eq. 
(70) becomes 

$$h_{\mathrm{ef}} - h \approx \frac{2Jd}{T}h_{\mathrm{ef}}. \tag{4.73}$$

Solving this equation for $h_{\mathrm{ef}}$, and then using Eq. (72), we get 

$$h_{\mathrm{ef}} \approx \frac{h}{1 - 2Jd/T} \equiv \frac{h}{1 - T_c/T}. \tag{4.74}$$

Recalling Eq. (66), we can rewrite this result for the order parameter, 

$$\eta = \frac{h_{\mathrm{ef}} - h}{T_c} \approx \frac{h}{T - T_c}, \tag{4.75}$$

meaning that the low-field susceptibility 


$$\chi \equiv \frac{\partial\eta}{\partial h}\bigg|_{h=0} = \frac{1}{T - T_c}, \qquad \text{for } T > T_c. \tag{4.76}$$

This is the famous Curie-Weiss law, which shows that the susceptibility diverges at the approach to the 
Curie temperature $T_c$. 

41 This fact may be readily verified by using Eqs. (64) and (68) to calculate F. Now condition $\partial F/\partial\eta|_{h=0} = 0$ 
returns us to Eq. (71), and calculating the second derivative, for $T < T_c$ we get $\partial^2 F/\partial\eta^2 > 0$ at $\eta = \pm\eta_0$ (indicating 
two stable minima of F), and $\partial^2 F/\partial\eta^2 < 0$ at $\eta = 0$ (the unstable maximum of F). 
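The Curie-Weiss law (76) may be checked against a direct numerical solution of the self-consistency equation (70). The sketch below is only an illustration under stated assumptions: the function name, the bisection bracket, and the parameter values (J = 1, d = 3, a small test field h) are all hypothetical choices, and the routine is meant for the paramagnetic case $T > T_c$ with $h \ge 0$, where Eq. (70) has a single solution.

```python
import math

def weiss_order_parameter(h, T, J=1.0, d=3):
    """Order parameter eta from Eq. (4.70), for T > Tc and h >= 0.

    Hypothetical helper: solves h_ef - h = 2*J*d*tanh(h_ef/T) by bisection,
    then uses Eq. (4.66), eta = (h_ef - h)/(2*J*d).
    """
    Tc = 2.0 * J * d
    if T <= Tc or h < 0.0:
        raise ValueError("this sketch covers only T > Tc and h >= 0")

    def g(x):
        # left-hand part minus right-hand part of Eq. (4.70); monotonic for T > Tc
        return x - h - Tc * math.tanh(x / T)

    lo, hi = 0.0, h + Tc + 1.0             # g(lo) = -h <= 0 and g(hi) >= 1 > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    h_ef = 0.5 * (lo + hi)
    return (h_ef - h) / Tc

if __name__ == "__main__":
    h = 1e-6                               # small field, to stay in the linear regime
    for T in (6.5, 7.0, 9.0, 12.0):        # Tc = 6 for J = 1, d = 3
        chi_numeric = weiss_order_parameter(h, T) / h
        chi_law = 1.0 / (T - 6.0)          # Curie-Weiss law, Eq. (4.76)
        print(f"T = {T:5.2f}  numeric = {chi_numeric:.6f}  Curie-Weiss = {chi_law:.6f}")
```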


In the ferromagnetic case, the graphic solution (Fig. 9b) of Eq. (70) gives a qualitatively different 
result. A field increase leads, depending on the spontaneous magnetization, either to the further saturation 
of $h_{\mathrm{mol}}$ (with the order parameter $\eta$ gradually approaching 1), or, if the initial $\eta$ was negative, to a jump to 
positive $\eta$ at some critical (coercive) field $h_c$. In contrast with the crude mean-field approximation (59), 
at T > 0 the coercive field is smaller than that given by Eq. (60), and the magnetization saturation is 
gradual, in a good (semi-qualitative) accordance with experiment. 


To summarize, the Weiss’ molecular-field theory gives a more realistic description of the 
ferromagnetic and paramagnetic phases in the Ising model, and a very simple prediction (72) of the 
temperature of the phase transition between them, for an arbitrary dimensionality d of the cubic lattice. 
It also allows finding all other parameters of the mean-field theory for that model - an easy exercise left 
for the reader. 


4.5. Ising model: Exact and numerical results 

In order to evaluate the main prediction (72) of the Weiss theory, let us now discuss the exact 
(analytical) and quasi-exact (numerical) results obtained for the Ising model, going from the lowest 
dimensionality d = 0 to its higher values. 

Zero dimensionality means that a spin has no nearest neighbors at all, so that the first term of Eq. 
(23) vanishes. Hence Eq. (64), with $h_{\mathrm{ef}} = h$, is exact, and so is its solution (69). Now we can repeat the 
calculations that have led us to Eq. (76), with J = 0, i.e. $T_c = 0$, and reduce this result to the so-called 
Curie law: 

$$\chi = \frac{1}{T}. \tag{4.77}$$

It shows that $T_c = 0$, i.e. the system is paramagnetic at any temperature. One may say that for this case 
the Weiss molecular-field theory is exact - or in some sense trivial, because it provides an exact, fully 
quantum-mechanical treatment of spin-½ particles at negligible interaction. Experimentally, the Curie 
law is approximately valid for many so-called paramagnetic materials, i.e. 3D systems with a weak 
interaction between particle spins. 

The case d = 1 is more complex, but has an exact analytical solution. Probably the simplest way 
to obtain it is to use the so-called transfer matrix approach. 42 For this, first of all, we may argue that 
properties of a 1D system of $N \gg 1$ spins (say, put at equal distances on a straight line) should not 
change noticeably if we bend that line gently into a closed loop (Fig. 10), i.e. assume that spins $s_1$ and $s_N$ 
form one more pair of next neighbors, giving one more contribution, $-Js_1s_N$, to energy (23): 

$$E_m = -(Js_1s_2 + Js_2s_3 + \dots + Js_Ns_1) - (hs_1 + hs_2 + \dots + hs_N). \tag{4.78}$$


42 It was developed in 1941 by H. Kramers and G. Wannier. Note that the approach is very close to the one used 
in 1D quantum mechanics - see, e.g., QM Sec. 2.5. 

Let us regroup terms of this sum in the following way: 

$$E_m = -\left[\left(\frac{h}{2}s_1 + Js_1s_2 + \frac{h}{2}s_2\right) + \left(\frac{h}{2}s_2 + Js_2s_3 + \frac{h}{2}s_3\right) + \dots + \left(\frac{h}{2}s_N + Js_Ns_1 + \frac{h}{2}s_1\right)\right], \tag{4.79}$$

so that the group in each parentheses depends only on the state of two adjacent spins. The corresponding 
statistical sum, 

$$Z = \sum_{\substack{s_j = \pm 1 \\ j = 1,2,\dots,N}} \exp\left\{h\frac{s_1}{2T} + J\frac{s_1s_2}{T} + h\frac{s_2}{2T}\right\}\exp\left\{h\frac{s_2}{2T} + J\frac{s_2s_3}{T} + h\frac{s_3}{2T}\right\}\cdots\exp\left\{h\frac{s_N}{2T} + J\frac{s_Ns_1}{T} + h\frac{s_1}{2T}\right\}, \tag{4.80}$$

has $2^N$ terms, each corresponding to a certain combination of signs of N spins. Each operand of the 
product under the sum may take 4 values for 4 different combinations of its two arguments: 

$$\exp\left\{h\frac{s_j}{2T} + J\frac{s_js_{j+1}}{T} + h\frac{s_{j+1}}{2T}\right\} = \begin{cases} \exp\{(J + h)/T\}, & \text{for } s_j = s_{j+1} = +1, \\ \exp\{(J - h)/T\}, & \text{for } s_j = s_{j+1} = -1, \\ \exp\{-J/T\}, & \text{for } s_j = -s_{j+1}. \end{cases} \tag{4.81}$$

Fig. 4.10. 1D Ising system on a circular loop. 


These values do not depend on index j, 43 and may be presented as elements of the so-called 
transfer matrix 

$$\mathrm{M} \equiv \begin{pmatrix} \exp\{(J + h)/T\} & \exp\{-J/T\} \\ \exp\{-J/T\} & \exp\{(J - h)/T\} \end{pmatrix}, \tag{4.82}$$

and the whole statistical sum may be recast as a product: 

$$Z = \sum_{\substack{s_j = \pm 1 \\ j = 1,2,\dots,N}} M_{s_1s_2}M_{s_2s_3}\cdots M_{s_{N-1}s_N}M_{s_Ns_1}. \tag{4.83}$$

According to the basic rule of matrix multiplication, this sum is just 

$$Z = \mathrm{Tr}\left(\mathrm{M}^N\right). \tag{4.84}$$

Matrix algebra tells us that this trace may be presented just as 

$$Z = \lambda_+^N + \lambda_-^N, \tag{4.85}$$

where $\lambda_\pm$ are the eigenvalues of the transfer matrix M, i.e. the roots of its characteristic equation, 

$$\begin{vmatrix} \exp\{(J + h)/T\} - \lambda & \exp\{-J/T\} \\ \exp\{-J/T\} & \exp\{(J - h)/T\} - \lambda \end{vmatrix} = 0. \tag{4.86}$$

A straightforward calculation yields 

$$\lambda_\pm = \exp\left\{\frac{J}{T}\right\}\left[\cosh\frac{h}{T} \pm \left(\sinh^2\frac{h}{T} + \exp\left\{-\frac{4J}{T}\right\}\right)^{1/2}\right]. \tag{4.87}$$

43 This is of course a result of the “translational” (or rather rotational) symmetry of the system, i.e. its invariance 
to the index replacement $j \to j + 1$ in all terms of energy $E_m$ (besides index N which should be replaced with 1). 


Now the last simplification comes from condition $N \gg 1$ - which we needed anyway, to make 
the loop model equivalent to an infinite 1D system. In this limit, even a small difference between the eigenvalues, 
$\lambda_+ > \lambda_-$, makes the second term in Eq. (85) negligible, so that we finally get 

$$Z \approx \lambda_+^N = \exp\left\{\frac{NJ}{T}\right\}\left[\cosh\frac{h}{T} + \left(\sinh^2\frac{h}{T} + \exp\left\{-\frac{4J}{T}\right\}\right)^{1/2}\right]^N. \tag{4.88}$$
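The chain of results (84)-(88) is easy to verify numerically for a small ring. The following Python sketch (an illustration only; all function names and parameter values are assumptions) compares the brute-force sum over all $2^N$ spin configurations with the two-eigenvalue formula (85), and shows how quickly the contribution of $\lambda_-$ becomes negligible.

```python
import itertools
import math

def energy_ring(spins, J, h):
    """Energy (4.78) of a closed 1D Ising chain; spins is a tuple of +1/-1 values."""
    N = len(spins)
    coupling = sum(spins[j] * spins[(j + 1) % N] for j in range(N))
    return -J * coupling - h * sum(spins)

def z_brute_force(N, J, h, T):
    """Statistical sum computed directly over all 2**N configurations."""
    return sum(math.exp(-energy_ring(s, J, h) / T)
               for s in itertools.product((+1, -1), repeat=N))

def eigenvalues(J, h, T):
    """Eigenvalues (4.87) of the transfer matrix (4.82)."""
    root = math.sqrt(math.sinh(h / T) ** 2 + math.exp(-4.0 * J / T))
    prefactor = math.exp(J / T)
    return prefactor * (math.cosh(h / T) + root), prefactor * (math.cosh(h / T) - root)

def z_transfer_matrix(N, J, h, T):
    """Statistical sum from Eq. (4.85): Z = lambda_+**N + lambda_-**N."""
    lam_plus, lam_minus = eigenvalues(J, h, T)
    return lam_plus ** N + lam_minus ** N

if __name__ == "__main__":
    J, h, T = 1.0, 0.3, 1.5                # arbitrary test values
    for N in (4, 8, 12):
        lam_plus, lam_minus = eigenvalues(J, h, T)
        print(N, z_brute_force(N, J, h, T), z_transfer_matrix(N, J, h, T),
              (lam_minus / lam_plus) ** N)  # relative weight of the dropped term
```

The brute-force sum is practical only for small N, since its cost doubles with each added spin, while the eigenvalue formula costs the same for any N.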


From here, we can find the free energy per particle 

$$\frac{F}{N} = \frac{T}{N}\ln\frac{1}{Z} = -J - T\ln\left[\cosh\frac{h}{T} + \left(\sinh^2\frac{h}{T} + \exp\left\{-\frac{4J}{T}\right\}\right)^{1/2}\right], \tag{4.89}$$


and hence can calculate all variables of interest from thermodynamic relations. In particular, the 
equilibrium value of the order parameter may be found from the last of Eqs. (1.39), with the 
replacements discussed above: $G \to F$, $P \to -h$, and hence $V = (\partial G/\partial P)_T \to -(\partial F/\partial h)_T = N\eta$. For low 
fields ($h \ll T$), this formula yields 

$$\eta = \frac{h}{T}\exp\left\{\frac{2J}{T}\right\}. \tag{4.90}$$

This result describes linear magnetization with the following low-field susceptibility, 

$$\chi \equiv \frac{\partial\eta}{\partial h}\bigg|_{h=0} = \frac{1}{T}\exp\left\{\frac{2J}{T}\right\}, \tag{4.91}$$


and means that the 1D Ising model does not exhibit a phase transition, i.e., $T_c = 0$. However, its 
susceptibility grows, at $T \to 0$, much faster than the Curie law (77). This gives us a hint that at low 
temperatures the system is “virtually ferromagnetic”, i.e. has the ferromagnetic order with some rare 
violations. (In physics, such violations are called low-temperature excitations.) This perception may be 
confirmed by the following approximate calculation. 
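The low-field results for the 1D chain may be cross-checked by differentiating the free energy (89) numerically. In the sketch below (function names, the finite-difference step, and the parameter values are assumptions of this example), the order parameter is obtained as $\eta = -\partial(F/N)/\partial h$ and compared with the linear law $(h/T)\exp\{2J/T\}$. Note that the comparison is meaningful only while $\sinh(h/T)$ remains much smaller than $\exp\{-2J/T\}$, which at low temperatures is a stricter requirement than just $h \ll T$.

```python
import math

def free_energy_per_spin(h, T, J=1.0):
    """F/N of the 1D Ising chain in the limit N >> 1, Eq. (4.89)."""
    root = math.sqrt(math.sinh(h / T) ** 2 + math.exp(-4.0 * J / T))
    return -J - T * math.log(math.cosh(h / T) + root)

def order_parameter_1d(h, T, J=1.0, dh=1e-6):
    """eta = -d(F/N)/dh, estimated by a central finite difference (assumed step dh)."""
    f_plus = free_energy_per_spin(h + dh, T, J)
    f_minus = free_energy_per_spin(h - dh, T, J)
    return -(f_plus - f_minus) / (2.0 * dh)

if __name__ == "__main__":
    J, h = 1.0, 1e-3
    for T in (4.0, 2.0, 1.0):
        eta_numeric = order_parameter_1d(h, T, J)
        eta_linear = (h / T) * math.exp(2.0 * J / T)   # low-field linear law
        curie = h / T                                   # Curie law (4.77), for comparison
        print(f"T = {T:3.1f}  eta = {eta_numeric:.6e}  linear = {eta_linear:.6e}  Curie = {curie:.6e}")
```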

It is almost evident that the lowest-energy excitation of a 1D ferromagnet at h = 0 is the reversal 
of signs of all spins in one of its parts (Fig. 11). Indeed, since such excitation (called the Bloch wall) 
involves the change of sign of just one product $s_js_{j'}$, according to Eq. (78), its energy $E_w$ (defined as the 
difference between values of $E_m$ with and without the excitation) equals 2J, regardless of the wall 
position. Since in a ferromagnet, parameter J is positive, $E_w > 0$. If the system tried to minimize its 
potential energy, having any wall in the system would be energy-disadvantageous. However, 
thermodynamics tells us that at finite T, system’s equilibrium corresponds to the minimum of free 
energy rather than just energy. 44 Hence, we have to calculate Bloch wall’s contribution $F_w$ to the free 
energy. Since in a linear chain of $N \gg 1$ spins, the wall can take $(N - 1) \approx N$ positions with the same 
energy $E_w$, we may claim that the entropy $S_w$ associated with an excitation of this type is $\ln N$, so that 
according to definition (1.33) of the free energy, 

$$F_w = E_w - TS_w \approx 2J - T\ln N. \tag{4.92}$$

Fig. 4.11. A Bloch wall in a 1D Ising system. 


This result tells us that in the limit $N \to \infty$, and at $T \ne 0$, walls are always free-energy-beneficial, 
thus explaining the absence of the perfect ferromagnetic order in the 1D Ising system. Note, however, 
that since the logarithm grows extremely slowly at large values of its argument, one may argue that a 
large but finite 1D system would still feature a quasi-critical temperature 

$$T_c = \frac{2J}{\ln N}, \tag{4.93}$$

below which it would feature a virtually complete ferromagnetic order. (The exponentially large 
susceptibility (91) is a manifestation of this fact.) 

Now let us apply a similar approach to estimate $T_c$ of a 2D Ising model. Here the Bloch wall is a 
line of certain length L - see Fig. 12. (For this example, counting from the left to the right, 
L = 2 + 1 + 4 + 2 + 3 = 12 lattice periods.) 



Fig. 4.12. A Bloch wall in a 2D Ising system. 


Evidently, the additional energy associated with such a wall is $E_w = 2JL$, while the wall’s entropy may 
be estimated approximately using the following reasoning. Let the wall be formed by the path of a 
“Manhattan pedestrian” traveling through the lattice between its nodes. At each junction, the pedestrian 
may select 3 of the 4 possible directions (all except the one that leads backward), so that there are 
approximately $3^{L-1} \approx 3^L$ options for a walk starting from a certain point, i.e. approximately 
$M \sim 2(N^{1/2} - 1) \times 3^L \approx 2N^{1/2}3^L$ different walks starting from two sides of a square-shaped lattice 
(of linear size $N^{1/2}$). Again calculating $S_w$ as $\ln M$, we get 

$$F_w = E_w - TS_w \approx 2JL - T\ln\left(2N^{1/2} \times 3^L\right) \equiv L(2J - T\ln 3) - T\ln\left(2N^{1/2}\right). \tag{4.94}$$

Since L scales as $N^{1/2}$ or higher, at $N \to \infty$ the last term is negligible, and we see that the sign of $\partial F_w/\partial L$ 
depends on whether the temperature is higher or lower than the following critical value 

$$T_c = \frac{2}{\ln 3}J \approx 1.82\,J. \tag{4.95}$$

At $T < T_c$, the free energy minimum corresponds to $L \to 0$, i.e. Bloch walls are free-energy-detrimental, 
and the system is in the ferromagnetic phase. 

44 If the reader is still uncomfortable with this core result of thermodynamics, he or she is strongly encouraged to 
revisit Eq. (1.43) and its discussion. 

So, for d = 2 the estimates predict a finite critical temperature of the same order as the Weiss’ 
theory ($T_c = 4J$). The major approximation in the calculation leading to Eq. (95) is disregarding possible 
self-crossing of the “Manhattan walk”. An accurate counting of such self-crossings is rather difficult. It 
was carried out in 1944 by L. Onsager; since then his calculations have been redone in several 
easier ways, but even they are rather cumbersome, and I will not have time to discuss them in detail. 45 
The final result, however, is surprisingly simple: 

$$\tanh\frac{J}{T_c} = \sqrt{2} - 1, \qquad \text{giving } T_c \approx 2.269\,J, \tag{4.96}$$

i.e. showing that the simple estimate (95) is only ~20% off the mark. 
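The numbers quoted above are easy to reproduce. The short Python sketch below (variable names are arbitrary, and J = 1 is an assumed unit of energy) evaluates the Bloch-wall estimate (95), inverts Onsager's relation (96), and compares both with the Weiss prediction (72) for d = 2.

```python
import math

J = 1.0                                               # assumed unit of energy

T_estimate = 2.0 * J / math.log(3.0)                  # Bloch-wall estimate, Eq. (4.95)
T_onsager = J / math.atanh(math.sqrt(2.0) - 1.0)      # from tanh(J/Tc) = sqrt(2) - 1, Eq. (4.96)
T_weiss = 2.0 * J * 2                                 # Weiss theory, Eq. (4.72), with d = 2

# Relative deviations from the exact (Onsager) value
error_estimate = (T_onsager - T_estimate) / T_onsager
error_weiss = (T_weiss - T_onsager) / T_onsager

print(f"Bloch-wall estimate: Tc = {T_estimate:.3f} J  ({error_estimate:.1%} below exact)")
print(f"Onsager (exact):     Tc = {T_onsager:.3f} J")
print(f"Weiss theory:        Tc = {T_weiss:.3f} J  ({error_weiss:.1%} above exact)")
```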


The Onsager solution, as well as all alternative solutions of the problem that were found later, 
are so “artificial” (2D-specific) that they do not give a clear clue to their generalization to other (higher) 
dimensions. As a result, the 3D Ising problem is still unsolved analytically. Nevertheless, we do know 
$T_c$ for that case with an extremely high precision - at least to the 6th decimal place. This has been 
achieved by numerical methods; they deserve a thorough discussion, because they are applicable to other problems as 
well. Conceptually, this task is rather simple: just compute, to the desired precision, the statistical sum 
of system (23): 

$$Z = \sum_{\substack{s_j = \pm 1 \\ j = 1,2,\dots,N}} \exp\left\{\frac{J}{T}\sum_{\{j,j'\}} s_js_{j'} + \frac{h}{T}\sum_j s_j\right\}. \tag{4.97}$$


As soon as this has been done for a sufficient number of values of dimensionless parameters J/T and h/T, 
everything else is easy; in particular, we can compute the dimensionless function 

$$F/T = -\ln Z, \tag{4.98}$$

and then find the ratio $J/T_c$ as the smallest value of parameter J/T at which F/T (as a function of ratio h/T) 
has a minimum at zero field. However, for any system of a reasonable size N, the “exact” computation 
of the statistical sum (97) is impossible, because it contains too many terms for any supercomputer to 
handle. For example, let us take a relatively small 3D lattice with $N = 10 \times 10 \times 10 = 10^3$ spins, which still 
features substantial boundary effects even using the periodic boundary conditions (similar to the Born-Karman 
conditions in the wave theory), so that its phase transition is smeared about $T_c$ by ~1%. Still, 
even for that crude model, Z would include $2^{1,000} \equiv (2^{10})^{100} \approx (10^3)^{100} = 10^{300}$ terms. Let us suppose we 
are using a prospective exaflops-scale computer performing $10^{18}$ floating-point operations per second, 
i.e. ~$10^{26}$ such operations per year. With those resources, the computation of just one statistical sum 
would require ~$10^{(300-26)} = 10^{274}$ years. To call such a number “astronomic” would be a strong 
understatement. (As a reminder, the age of our Universe is believed to be close to $1.3 \times 10^{10}$ years - a 
very humble number in comparison.) 

45 For that, the reader is referred to either Sec. 151 in the textbook by Landau and Lifshitz or Chapter 15 in the 
text by Huang, both cited above. 
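The order-of-magnitude arithmetic of this estimate may be repeated with logarithms, which avoids handling the huge numbers directly. The sketch below is a hedged illustration: it assumes, very optimistically, just one floating-point operation per term of the sum, and a machine speed of $10^{18}$ operations per second (the literal meaning of “exaflops”). Without the rounding $2^{10} \approx 10^3$, it gives about $10^{275}$ years, consistent with the rough figure above.

```python
import math

n_spins = 10 * 10 * 10                          # 3D lattice of 10 x 10 x 10 spins
log10_terms = n_spins * math.log10(2.0)         # log10 of the number 2**N of terms in Z

ops_per_second = 1e18                           # assumed exaflops-scale machine
seconds_per_year = 365.25 * 24 * 3600
log10_ops_per_year = math.log10(ops_per_second * seconds_per_year)

# Assumption: one operation per term (a gross underestimate of the real cost)
log10_years = log10_terms - log10_ops_per_year

log10_universe_age = math.log10(1.3e10)         # age of the Universe in years, as quoted

print(f"terms in Z:        ~10^{log10_terms:.1f}")
print(f"operations/year:   ~10^{log10_ops_per_year:.1f}")
print(f"years required:    ~10^{log10_years:.1f}")
print(f"ages of Universe:  ~10^{log10_years - log10_universe_age:.1f}")
```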

This situation may be improved dramatically by noticing that any statistical sum, 

$$Z = \sum_m \exp\left\{-\frac{E_m}{T}\right\}, \tag{4.99}$$

is dominated by terms with lower values of $E_m$. In order to find those lowest-energy states, we may use 
the following powerful approach (belonging to a broad class of Monte-Carlo techniques), which 
essentially mimics one (randomly selected) path of system’s evolution in time. One could argue that for 
that we would need to know the exact laws of evolution of statistical systems, 46 that may differ from one 
system to another, even if their energy spectra $E_m$ are the same. This is true, but since the equilibrium 
value of Z should be independent of these details, it may be evaluated using any kinetic model, provided 
that it satisfies certain general rules. In order to reveal these rules, let us start from a system with just 
two states, $E_m$ and $E_{m'} = E_m + \Delta$ - see Fig. 13. 


[Figure: two energy levels, $E_m$ (with probability $W_m$) and $E_{m'} = E_m + \Delta$ (with probability $W_{m'}$).] 

Fig. 4.13. Deriving the detailed balance equation. 


In the absence of quantum coherence between the states (see Sec. 2.1), equations for time 
evolution of the corresponding probabilities $W_m$ and $W_{m'}$ should depend only on the probabilities (plus 
certain constant coefficients). Moreover, since equations of quantum mechanics are linear, the equations 
of probability evolution should be also linear. Hence, it is natural to expect them to have the following 
form, 

$$\frac{dW_m}{dt} = W_{m'}\Gamma_\downarrow - W_m\Gamma_\uparrow, \qquad \frac{dW_{m'}}{dt} = W_m\Gamma_\uparrow - W_{m'}\Gamma_\downarrow, \tag{4.100}$$

where constant coefficients $\Gamma_\uparrow$ and $\Gamma_\downarrow$ have the physical sense of rates of the corresponding transitions - 
see Fig. 13. According to the master equations (100) the rates have simple meaning: for example, $\Gamma_\uparrow dt$ 
is the probability of the system’s transition into state m’ during an infinitesimal time interval dt, 
provided that in the beginning of that interval it was in state m with full certainty: $W_m = 1$, $W_{m'} = 0$. 47 


46 Discussion of such laws is the task of physical kinetics, which will be briefly reviewed in Chapter 6. 

47 The calculation of these rates for several particular cases is described in QM Secs. 6.6, 6.7, and 7.6. 

Since for the system with just two energy levels, the time derivatives of the probabilities are 
equal and opposite, Eqs. (100) describe an (irreversible) redistribution of the probabilities while keeping 
their sum $W = W_m + W_{m'}$ constant. At $t \to \infty$, $d/dt \to 0$, and the probabilities settle to their stationary 
values related as 

$$\frac{W_{m'}}{W_m} = \frac{\Gamma_\uparrow}{\Gamma_\downarrow}. \tag{4.101}$$

Now let us require that these stationary values obey the Gibbs distribution (2.58); then 

$$\frac{W_{m'}}{W_m} = \exp\left\{\frac{E_m - E_{m'}}{T}\right\} = \exp\left\{-\frac{\Delta}{T}\right\} < 1. \tag{4.102}$$

Comparing these two expressions, we see that the rates have to satisfy the following detailed balance 
relation 

$$\frac{\Gamma_\uparrow}{\Gamma_\downarrow} = \exp\left\{-\frac{\Delta}{T}\right\}. \tag{4.103}$$


By the way, this relation may serve as an important sanity check: the rates calculated using any 
reasonable model of a quantum system have to satisfy it. 48 
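The role of the detailed balance relation (103) may be illustrated by integrating the master equations (100) numerically. In the following minimal sketch (the function name, the explicit Euler scheme, the time step, and the parameter values are all assumptions of this example), the upward rate is fixed by Eq. (103), and the probabilities, started far from equilibrium, relax to the Gibbs ratio (102).

```python
import math

def relax_two_state(delta, T, gamma_down=1.0, dt=1e-3, steps=20000):
    """Euler integration of the master equations (4.100) for two levels split by delta > 0.

    Hypothetical helper: gamma_down, dt and steps are assumed values; the upward
    rate is set by the detailed balance relation (4.103).
    """
    gamma_up = gamma_down * math.exp(-delta / T)
    w_low, w_high = 0.0, 1.0               # start with the upper level fully occupied
    for _ in range(steps):
        # net probability flow from the upper to the lower level during dt
        flow = (w_high * gamma_down - w_low * gamma_up) * dt
        w_low += flow
        w_high -= flow
    return w_low, w_high

if __name__ == "__main__":
    delta, T = 1.0, 1.0
    w_low, w_high = relax_two_state(delta, T)
    print("stationary ratio W_m'/W_m:", w_high / w_low)
    print("Gibbs ratio exp(-delta/T):", math.exp(-delta / T))
    print("sum of probabilities:     ", w_low + w_high)
```

Any other pair of rates with the same ratio would lead to the same stationary state; only the relaxation time would change.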


Now comes the final argument: since the rates of transition between two particular states should 
not depend on other states and their occupation, Eq. (103) has to be valid for each pair of states of any 
multi-state system. The detailed balance yields only one equation for two rates Tt and E|; if our only 
goal is the calculation of Z, the choice of the other equation is not too important. Perhaps the simplest 
choice is 


r(A) oc y(A) 


1, if A < 0, 

exp {-A IT], otherwise , 


(4.104) 


where $\Delta$ is the energy change resulting from the transition. This model, which evidently satisfies the
detailed balance relation (103), is the most popular for its simplicity, despite the unphysical cusp this
function has at $\Delta = 0$. The simplicity of Eq. (104) enables the following Metropolis algorithm (Fig. 14).
The calculation starts from setting a certain initial state of the system. At relatively high temperatures,
the state may be generated randomly; for example, for the Ising system, the initial state of each spin $s_j$
may be selected independently, with a 50% probability for each of its two values. At low temperatures,
starting the calculations from the lowest-energy state (in particular, for the Ising model, from the
ferromagnetic state $s_j = \mathrm{sgn}(h) = \mathrm{const}$) may give the fastest convergence of the sum (97).
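As a quick numerical illustration of the sanity check mentioned above, here is a minimal Python sketch. It evaluates the acceptance factor of Eq. (104) and compares the ratio of the "up" and "down" factors for the same pair of levels with $\exp\{-\Delta/T\}$. The function names `gamma` and `rate_ratio` are illustrative assumptions, not part of the original text.

```python
import math


def gamma(delta, T):
    """Acceptance factor of Eq. (104): 1 for downhill moves, exp(-delta/T) otherwise."""
    return 1.0 if delta < 0 else math.exp(-delta / T)


def rate_ratio(delta, T):
    """Ratio of the 'up' factor (energy change +delta) to the 'down' one (-delta), for delta > 0."""
    return gamma(delta, T) / gamma(-delta, T)


if __name__ == "__main__":
    T = 1.5  # temperature in energy units (illustrative value)
    for delta in (0.1, 1.0, 5.0):
        # The two printed numbers should coincide if detailed balance holds.
        print(delta, rate_ratio(delta, T), math.exp(-delta / T))
```

The same kind of check may be applied to any other model of the rates: only the body of `gamma` would change.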


Now one spin is flipped at random, and the corresponding change of energy ($\Delta$) is calculated, 49
and plugged into Eq. (104) to calculate $\gamma(\Delta)$. Next, a pseudo-random number generator is used to
generate a random number $\xi$ with the probability density uniformly distributed on the segment [0, 1]. (Such


48 See, e.g., QM Eq. (7.196) for a quantum system bilinearly coupled to an environment in thermal equilibrium.
By the way, that formula (as well as results for all realistic physical systems) does not feature the unphysical cusp
of function $\Gamma(\Delta)$ at $\Delta = 0$, assumed by the popular model (104).

49 Note that the flip changes signs of only (2d + 1) terms in sum (23), i.e. does not require re-calculation of all
(2d + 1)N terms of the sum, so that the computation of $\Delta$ takes just a few add-multiply operations even at $N \gg 1$.
functions, typically called RND, are available in virtually any numerical library.) If the resulting $\xi$ is
less than $\gamma(\Delta)$, the transition is accepted, while if $\xi > \gamma(\Delta)$, it is rejected. In view of Eq. (104), this
means that any transition down the energy spectrum ($\Delta < 0$) is always accepted, while those up the
energy profile ($\Delta > 0$) are accepted with the probability proportional to $\exp\{-\Delta/T\}$. The latter feature is
necessary to avoid system trapping in local minima of its multidimensional energy profile
$E_m(s_1, s_2, \ldots, s_N)$. Now the statistical sum may be calculated approximately as a partial sum over the states
passed by the system. (It is better to discard the contributions from a few first steps to avoid an error due
to the initial state choice.)
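The steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: it treats a 1D ring of spins with the energy $E = -J\sum_j s_j s_{j+1} - h\sum_j s_j$ (assumed here to be the form of sum (23)), and it accumulates the running average of the energy per spin rather than the statistical sum itself. The function name and all its parameters are hypothetical.

```python
import math
import random


def metropolis_ising_ring(n_spins, J, h, T, n_steps, n_discard, seed=0):
    """Average energy per spin of a 1D Ising ring, sampled by single-spin-flip Metropolis moves."""
    rng = random.Random(seed)
    # High-temperature start: each spin is +1 or -1 with 50% probability.
    spins = [rng.choice((-1, 1)) for _ in range(n_spins)]
    energy = -sum(J * spins[j] * spins[(j + 1) % n_spins] + h * spins[j]
                  for j in range(n_spins))
    energy_sum, n_samples = 0.0, 0
    for step in range(n_steps):
        j = rng.randrange(n_spins)                      # pick one spin at random
        neighbors = spins[j - 1] + spins[(j + 1) % n_spins]
        delta = 2.0 * spins[j] * (J * neighbors + h)    # energy change of the trial flip
        gamma = 1.0 if delta < 0 else math.exp(-delta / T)  # Eq. (104)
        if rng.random() < gamma:                        # accept the flip
            spins[j] = -spins[j]
            energy += delta
        if step >= n_discard:                           # skip the first steps
            energy_sum += energy
            n_samples += 1
    return energy_sum / n_samples / n_spins


if __name__ == "__main__":
    print(metropolis_ising_ring(n_spins=50, J=1.0, h=0.0, T=2.0,
                                n_steps=200_000, n_discard=20_000))
```

For h = 0 and a long ring, the output may be compared with the known exact 1D result, $-J\tanh(J/T)$ per spin. Note that only the two neighbors of the flipped spin enter the calculation of $\Delta$, so the cost of one step does not grow with the system size.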



Fig. 4.14. Crude scheme of the Monte Carlo algorithm for the Ising model simulation.


This algorithm is extremely efficient. Even with the modest computers available in the 1980s, it
allowed the simulation of a 3D Ising system of $(128)^3$ spins, with the following result: $J/T_c \approx 0.221650 \pm
0.000005$. For all practical purposes, this result is exact (so that perhaps the largest benefit of the
possible analytical solution for the infinite 3D Ising system would be a virtually certain Nobel Prize for
the author :-). Table 2 summarizes values of $T_c$ for the Ising model. Very visible is the fast improvement
of prediction accuracy of the molecular-field theory - which is asymptotically correct at $d \to \infty$.


Table 2. Critical temperature $T_c$ (in the units of J) of the Ising model
of a ferromagnet (J > 0) for several values of dimensionality d

d    Molecular-field theory - Eq. (72)    Exact value    Exact value's source
0    0                                    0              Gibbs distribution
1    2                                    0              Transfer matrix theory
2    4                                    2.269...       Onsager's solution
3    6                                    4.513...       Numerical simulation
Finally, I need to mention the renormalization-group ("RG") approach, 50 despite its low
efficiency for the Ising problem. The basic idea of this approach stems from the scaling law (30)-(31): at
$T = T_c$ the correlation radius $r_c$ diverges. Hence, the critical temperature may be found from the
requirement for the system to be spatially self-similar. Namely, let us form larger and larger groups
("blocks") of adjacent spins, and require that all properties of the resulting system of the blocks
approach those of the initial system, as T approaches $T_c$.

Let us see how this idea works for the simplest nontrivial (1D) case, which is described by
statistical sum (80). Assuming N to be even (which does not matter at $N \to \infty$), and adding an
inconsequential constant C to each exponent (for the purpose that will be clear later on), we may rewrite
this expression as


$$Z = \sum_{s_j = \pm 1} \prod_{j=1,2,\ldots,N} \exp\left\{\frac{h}{2T}s_j + \frac{J}{T}s_j s_{j+1} + \frac{h}{2T}s_{j+1} + C\right\}.$$ (4.105)


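Let us group each pair of adjacent exponents to recast this expression as a product

$$Z = \sum_{s_j = \pm 1} \prod_{j=2,4,\ldots,N} \exp\left\{\frac{h}{2T}s_{j-1} + s_j\left[\frac{J}{T}(s_{j-1} + s_{j+1}) + \frac{h}{T}\right] + \frac{h}{2T}s_{j+1} + 2C\right\},$$ (4.106)

and carry out the summation over two possible states of the internal spins $s_j$ explicitly:

$$Z = \sum_{\substack{s_j = \pm 1 \\ (\text{for odd } j)}} \prod_{j=2,4,\ldots,N} \left[\exp\left\{\frac{h}{2T}s_{j-1} + \frac{J}{T}(s_{j-1} + s_{j+1}) + \frac{h}{T} + \frac{h}{2T}s_{j+1} + 2C\right\} + \exp\left\{\frac{h}{2T}s_{j-1} - \frac{J}{T}(s_{j-1} + s_{j+1}) - \frac{h}{T} + \frac{h}{2T}s_{j+1} + 2C\right\}\right]$$
$$= \sum_{\substack{s_j = \pm 1 \\ (\text{for odd } j)}} \prod_{j=2,4,\ldots,N} 2\cosh\left\{\frac{J}{T}(s_{j-1} + s_{j+1}) + \frac{h}{T}\right\} \exp\left\{\frac{h}{2T}(s_{j-1} + s_{j+1}) + 2C\right\}.$$ (4.107)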


Now let us require this statistical sum (and hence all statistical properties of the system of 2-spin
blocks) to be identical to that of the Ising system of N/2 spins, numbered by odd j:

$$Z' = \sum_{\substack{s_j = \pm 1 \\ (\text{for odd } j)}} \prod_{j=2,4,\ldots,N} \exp\left\{\frac{h'}{2T}s_{j-1} + \frac{J'}{T}s_{j-1}s_{j+1} + \frac{h'}{2T}s_{j+1} + C'\right\},$$ (4.108)

with some different parameters h', J', and C', for all 4 possible values of $s_{j-1} = \pm 1$ and $s_{j+1} = \pm 1$. Since
the right-hand part of Eq. (107) depends only on the sum $(s_{j-1} + s_{j+1})$, this requirement yields only 3
(rather than 4) independent equations for finding h', J', and C'. Of them, equations for h' and J' depend
only on h and J (but not on C), 51 and may be presented in an especially simple form,


50 Developed first in the quantum field theory in the 1950s, it was adapted to statistics by L. Kadanoff in 1966,
with a spectacular solution of the so-called Kondo problem by K. Wilson in 1972, later awarded with a Nobel Prize.

51 This might be expected, because physically C is just a certain constant addition to system's energy. However,
the introduction of that constant is mathematically necessary, because Eqs. (107) and (108) may be reconciled
only if $C' \neq C$.
RG equations for 1D Ising model

$$x' = \frac{x(1 + y)^2}{(x + y)(1 + xy)}, \qquad y' = \frac{y(x + y)}{1 + xy},$$ (4.109)

using notation

$$x \equiv \exp\left\{-\frac{4J}{T}\right\}, \qquad y \equiv \exp\left\{-\frac{2h}{T}\right\}.$$ (4.110)


Now the grouping procedure may be repeated, with the same result (109)-(110). Hence these
equations may be considered as recurrence relations describing repeated doubling of the spin block size.
Figure 15 shows (schematically) the trajectories of this dynamic system on the phase plane [x, y]. (A
trajectory is defined by the following property: for each of its points {x, y}, the point {x', y'} defined by
the "mapping" Eq. (109) is also on the same trajectory.) For ferromagnetic coupling (J > 0) and h > 0,
we may limit the analysis to the unit square $0 \le x, y \le 1$. If this flow diagram had a stable fixed point
with $x' = x \equiv x_\infty \neq 0$ (i.e. $J/T < \infty$) and $y' = y = 1$ (i.e. h = 0), then the first of Eqs. (110) would
immediately give us the critical temperature of the phase transition in the field-free system:

$$T_c = \frac{4J}{\ln(1/x_\infty)}.$$ (4.111)


However, Fig. 15 shows that the only fixed point of the 1D system is x = y = 0, which (at finite coupling
J) should be interpreted as $T_c = 0$. This is of course in agreement with the exact result of the
transfer-matrix analysis, but does not give any additional information.
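The recurrence relations (109) are easy to iterate numerically. The following Python sketch (with the hypothetical helper names `rg_step` and `rg_trajectory`) applies the mapping repeatedly, starting from an arbitrary point of the unit square, so that the flow sketched in Fig. 15 may be traced point by point. It assumes x + y > 0, since the mapping is singular at the very point x = y = 0.

```python
def rg_step(x, y):
    """One application of the mapping (109): the spin block size is doubled."""
    x_new = x * (1.0 + y) ** 2 / ((x + y) * (1.0 + x * y))
    y_new = y * (x + y) / (1.0 + x * y)
    return x_new, y_new


def rg_trajectory(x, y, n_steps):
    """List of points {x, y} visited during n_steps successive block doublings."""
    points = [(x, y)]
    for _ in range(n_steps):
        x, y = rg_step(x, y)
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Field-free case: y = 1 corresponds to h = 0, see Eq. (110).
    for x, y in rg_trajectory(0.01, 1.0, 6):
        print(f"x = {x:.5f}, y = {y:.5f}")
```

At y = 1 the mapping keeps y unchanged and reduces to $x' = 4x/(1 + x)^2$, so that any starting value 0 < x < 1 grows at each step, i.e. the system moves away from the point x = 0.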



Fig. 4.15. The RG flow diagram of the 1D Ising system (schematically).


Unfortunately, for higher dimensionalities the renormalization-group approach rapidly becomes
rather cumbersome, and requires certain approximations, whose accuracy cannot be easily controlled.
For the 2D Ising system, such approximations lead to the prediction $T_c/J \approx 2.55$, i.e. to a substantial
difference from the exact (Onsager's) result.


4.6. Exercise problems

4.1. Compare the third virial coefficient C(T) for the hardball model of particle interactions, that
follows from the van der Waals equation, with the exact result (whose calculation was the subject of
Problem 3.18).
4.2. Calculate the entropy and the internal energy of the van der Waals gas, and discuss the
results.

4.3. Use two different approaches to calculate $(\partial E/\partial V)_T$ for the van der Waals gas, and the change
of temperature of such a gas, with temperature-independent $C_V$, at its fast expansion into free space.

4.4.* Derive as many analytical results as you can for the temperature dependence of the phase-
equilibrium pressure $P_0(T)$ and the latent heat $\Lambda(T)$ within the van der Waals model. In particular,
explore the low-temperature limit ($T \ll T_c$), and the close vicinity of the critical point $T_c$.

4.5.* Calculate the critical values $P_c$, $V_c$, and $T_c$ for the so-called Redlich-Kwong model of the
real gas, with the following phenomenological equation of state: 52

$$P + \frac{a}{V(V + Nb)T^{1/2}} = \frac{NT}{V - Nb}.$$

Hint: Be prepared to solve a cubic equation with particular (numerical) coefficients.

4.6. Use the Clapeyron-Clausius formula (4.17) to calculate the latent heat $\Lambda$ of the Bose-
Einstein condensate, and compare the result with that obtained in Problem 3.18.

4.7. In the simplest model of the liquid-gas equilibrium, 53 temperature and pressure do not affect
the molecule's condensation energy $\Delta$. Calculate the concentration and pressure of the gas over the liquid's
surface, assuming that its molecules are classical, non-interacting particles.

4.8. Assuming the hardball model, with volume $V_0$ per molecule, for the liquid phase, but still
treating the gaseous phase as an ideal gas, describe how the results of the previous problem change if
the liquid phase is in the form of spherical drops of radius $R \gg V_0^{1/3}$. Briefly discuss the implications of
the result for water cloud formation.

4.9. A classical ideal gas of $N \gg 1$ particles is placed into a container of volume V and wall
surface area A. The particles may condense on container walls, losing potential energy $\Delta$ per particle,
and forming an ideal 2D gas. Calculate the equilibrium number of condensed particles and gas pressure,
and discuss their temperature dependences.

4.10. The inner surfaces of the walls of a closed container of volume V, filled with $N \gg 1$
indistinguishable particles, have $N_S \gg 1$ similar traps (small potential wells). Each trap can hold only
one particle, at energy $-\Delta < 0$. Assuming that the gas is ideal and classical, derive the equation for the
chemical potential $\mu$ of the system in equilibrium, and use it to calculate the potential and the gas
pressure in the limits of small and large values of the ratio $N/N_S$.


52 This equation of state, suggested in 1948, describes most real gases better than not only the original van der 
Waals model, but also its later 2-parameter alternatives, such as the Berthelot, modified-Berthelot, and Dieterici 
models, though approximations with more fitting parameters (such as the Soave-Redlich-Kwong model) work 
even better. 

53 For real liquids, the model is reasonable only within certain parameter ranges. 
4.11. Superconductivity may be suppressed by a sufficiently strong magnetic field. In the
simplest case of a bulk, long cylindrical sample of the type-I superconductor, placed into an external
magnetic field $\mathcal{B}_{ext}$ parallel to its surface, this suppression takes the form of a simultaneous transition of
the whole sample from the superconducting to the "normal" (non-superconducting) state at a certain
critical magnitude $\mathcal{B}_c(T)$. The critical field gradually decreases with temperature from the maximum
value $\mathcal{B}_c(0)$ at $T \to 0$ to zero at the critical temperature $T_c$. Assuming that the function $\mathcal{B}_c(T)$ is known,
calculate the latent heat of transition as a function of temperature, and find its values at $T \to 0$ and
$T = T_c$.

Hint: In this context, "bulk" means a sample much larger than the intrinsic length scales of the
superconductor (such as the London penetration depth $\delta_L$ and the coherence length $\xi$). 54 For such bulk
samples, magnetic properties of the superconducting state may be well described just as a perfect
diamagnetism, with zero magnetic permeability $\mu$.

4.12. In some textbooks, the discussion of thermodynamics of superconductivity is started with
displaying, as self-evident, the following formula:

2Ao 

where $F_s$ and $F_n$ are the free energy values in the superconducting and non-superconducting ("normal")
phases, and $\mathcal{B}_c(T)$ is the critical value of field. Is this formula correct, and if not, what modification is
necessary to make it valid? Assume that all conditions of the simultaneous field-induced phase transition
in the whole sample, spelled out in the previous problem, are satisfied.

4.13. In Sec. 4, we have discussed Weiss' molecular-field approach to the Ising model, in which
the lattice average $\langle s_j \rangle$ plays the role of the order parameter $\eta$. Use the results of that analysis to find
coefficients a and b in the corresponding Landau expansion (46) of the free energy. List the values of
critical exponents $\alpha$ and $\beta$, defined by Eqs. (26) and (28), within this approach.

4.14. Calculate the average order parameter and the low-field susceptibility $\chi$ of a ring of three
Ising-type "spins" ($s_j = \pm 1$), with similar ferromagnetic coupling J between all sites, in thermal
equilibrium. From the result, can you predict the low-temperature behavior of $\chi$ in an N-spin ring?

4.15. Use Eq. (88) to calculate the average energy, free energy, entropy and heat capacity (all per
lattice site), as functions of temperature T and field h, for the 1D Ising model. Sketch the temperature
dependence of the heat capacity for various values of ratio h/J, and give a physical interpretation of the
result.

4.16. Use the molecular-field theory to calculate the critical temperature and low-field
susceptibility of a d-dimensional cubic lattice of spins described by the so-called classical Heisenberg
model: 55


54 A brief discussion of these parameters, as well as of the difference between the type-I and type-II
superconductivity, may be found in EM Secs. 6.3-6.4.

55 This model is formally similar to the genuine (quantum) Heisenberg model - see Eq. (21). 
$$E_m = -J \sum_{\{j,j'\}} \mathbf{s}_j \cdot \mathbf{s}_{j'} - \sum_j \mathbf{h} \cdot \mathbf{s}_j.$$

Here, in contrast to the (otherwise, very similar) Ising model (23), the spin of each site is modeled by a
classical 3D vector $\mathbf{s}_j = \{s_{xj}, s_{yj}, s_{zj}\}$ of unit length: $s_j^2 = 1$.


Chapter 4 


Page 36 of 36 





Essential Graduate Physics 


SM: Statistical Mechanics 


Chapter 5. Fluctuations 

This chapter discusses fluctuations of statistical variables, mostly at thermodynamic equilibrium. In
particular, I will describe the intimate connection between fluctuations and dissipation (damping) in a
dynamic system weakly coupled to a multi-particle environment, which culminates in the Einstein
relation between the diffusion coefficient and mobility, the Nyquist formula, and their quantum-
mechanical generalization - the fluctuation-dissipation theorem. An alternative approach to the same
problem, based on the Smoluchowski and Fokker-Planck equations, is also discussed in brief.


5.1 . Characterization of fluctuations 


Fluctuation 


In the beginning of Chapter 2, we have discussed the notion of averaging, $\langle f \rangle$, of a variable f
over a statistical ensemble - see Eqs. (2.7) and (2.10). Now, the variable's fluctuation may be defined
simply as its deviation from the average:

$$\tilde{f} \equiv f - \langle f \rangle;$$ (5.1)


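this deviation is, evidently, also a random variable. The most important property of any fluctuation is
that its average (over the same statistical ensemble) equals zero:

$$\langle \tilde{f} \rangle = \langle f - \langle f \rangle \rangle = \langle f \rangle - \langle \langle f \rangle \rangle = \langle f \rangle - \langle f \rangle = 0.$$ (5.2)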


As a result, such an average cannot characterize fluctuations' intensity, whose simplest characteristic is the
variance (also called "dispersion"):

Variance: definition

$$\langle \tilde{f}^2 \rangle = \langle (f - \langle f \rangle)^2 \rangle.$$ (5.3)


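The following simple property of the variance is frequently convenient for its calculation:

$$\langle \tilde{f}^2 \rangle = \langle (f - \langle f \rangle)^2 \rangle = \langle f^2 - 2f\langle f \rangle + \langle f \rangle^2 \rangle = \langle f^2 \rangle - 2\langle f \rangle^2 + \langle f \rangle^2,$$ (5.4a)

so that, finally:

Variance via averages

$$\langle \tilde{f}^2 \rangle = \langle f^2 \rangle - \langle f \rangle^2.$$ (5.4b)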


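As the simplest example of its application, consider a variable which can take only two values, $\pm 1$, with
equal probabilities $W_j = 1/2$. For such a variable,

$$\langle f \rangle = \sum_j W_j f_j = \frac{1}{2}(+1) + \frac{1}{2}(-1) = 0, \quad \text{but} \quad \langle f^2 \rangle = \sum_j W_j f_j^2 = \frac{1}{2}(+1)^2 + \frac{1}{2}(-1)^2 = 1,$$
$$\text{so that} \quad \langle \tilde{f}^2 \rangle = \langle f^2 \rangle - \langle f \rangle^2 = 1.$$ (5.5)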
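The equivalence of the two forms of the variance, Eqs. (3) and (4b), may be checked for any discrete distribution with a few lines of Python. The sketch below is illustrative only; the helper names `mean` and `variance` are assumptions, not notation from the text.

```python
def mean(values, probs):
    """Ensemble average <f> = sum_j W_j f_j."""
    return sum(w * f for w, f in zip(probs, values))


def variance(values, probs):
    """Return the variance computed directly, Eq. (3), and via averages, Eq. (4b)."""
    avg = mean(values, probs)
    direct = sum(w * (f - avg) ** 2 for w, f in zip(probs, values))
    via_averages = mean([f * f for f in values], probs) - avg ** 2
    return direct, via_averages


if __name__ == "__main__":
    # The two-valued variable of Eq. (5): f = +1 or -1 with equal probabilities.
    print(mean([+1, -1], [0.5, 0.5]), variance([+1, -1], [0.5, 0.5]))
```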


R.m.s. fluctuation

The square root of variance,

$$\delta f \equiv \langle \tilde{f}^2 \rangle^{1/2},$$ (5.6)
is called the root-mean-square (r.m.s.) fluctuation. An advantage of this measure is that it has the same
dimensionality as the variable itself, so that the ratio $\delta f/\langle f \rangle$ is dimensionless, and may be used to characterize
the relative intensity of fluctuations. In particular, as has been mentioned in Chapter 1, all results of
thermodynamics are valid only if the fluctuations of thermodynamic variables (internal energy E,
entropy S, etc.) are relatively small. 1 Let us make the simplest estimate of the relative intensity of
fluctuations by considering a system of N independent, similar particles, and an extensive variable

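$$\mathcal{F} = \sum_{j=1}^{N} f_j,$$ (5.7)

where $f_j$ depends on the state of just one (j-th) particle. The statistical average of $\mathcal{F}$ is evidently

$$\langle \mathcal{F} \rangle = \sum_{j=1}^{N} \langle f_j \rangle = N \langle f \rangle,$$ (5.8)

while the variance is

$$\langle \tilde{\mathcal{F}}^2 \rangle = \left\langle \sum_{j=1}^{N} \tilde{f}_j \sum_{j'=1}^{N} \tilde{f}_{j'} \right\rangle = \sum_{j,j'=1}^{N} \langle \tilde{f}_j \tilde{f}_{j'} \rangle.$$ (5.9)

Now we may use the fact that for two independent variables

$$\langle \tilde{f}_j \tilde{f}_{j'} \rangle = 0, \quad \text{for } j' \neq j;$$ (5.10)

actually, this equation may be considered as the mathematical definition of the independence. Hence, in
the sum (9), only the terms with j' = j survive, and

$$\langle \tilde{\mathcal{F}}^2 \rangle = \sum_{j,j'=1}^{N} \langle \tilde{f}_j^2 \rangle \delta_{j,j'} = N \langle \tilde{f}^2 \rangle.$$ (5.11)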
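Equations (8) and (11) are easy to check by a direct numerical experiment. The Python sketch below is a rough illustration under the following assumptions, none of which comes from the text: each $f_j$ is drawn independently and uniformly from [0, 1), so that $\langle f \rangle = 1/2$ and $\langle \tilde{f}^2 \rangle = 1/12$, and the helper name `sum_statistics` is hypothetical.

```python
import random
import statistics


def sum_statistics(n_particles, n_trials=4000, seed=1):
    """Sample the sum F of n_particles independent variables; return its mean and variance."""
    rng = random.Random(seed)
    totals = [sum(rng.random() for _ in range(n_particles))
              for _ in range(n_trials)]
    return statistics.mean(totals), statistics.pvariance(totals)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        avg, var = sum_statistics(n)
        # Eq. (8) predicts avg close to n/2, and Eq. (11) predicts var close to n/12.
        print(n, avg, n / 2, var, n / 12)
```

Because both the average and the variance grow in proportion to N, the ratio of the r.m.s. fluctuation to the average falls as $N^{-1/2}$.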

Comparing Eqs. (8) and (11), we see that the relative intensity of fluctuations of variable $\mathcal{F}$,

Relative fluctuation estimate

$$\frac{\delta \mathcal{F}}{\langle \mathcal{F} \rangle} = \frac{1}{N^{1/2}} \frac{\delta f}{\langle f \rangle},$$ (5.12)

tends to zero as the system size grows ($N \to \infty$). It is this fact that justifies the thermodynamic approach
to typical physical systems, with the number N of particles of the order of the Avogadro number $N_A \sim
10^{24}$. Nevertheless, in many situations even small fluctuations of thermodynamic variables are
important, and in this chapter we will calculate their basic properties, starting from the variance.

The reader may be pleased to notice that for some simple (but important) cases, such a
calculation has already been done in our course. For example, for any generalized coordinate $q_j$ and
generalized momentum $p_j$ that give quadratic contributions to system's Hamiltonian (2.46), we have
derived the equipartition theorem (2.48), valid in the classical limit. Since the average values of these


1 Let me remind the reader that up to this point, the averaging signs $\langle \ldots \rangle$ were dropped in most formulas, for the
sake of notation simplicity. In this chapter I have to restore these signs to avoid confusion. The only exception
will be temperature whose average, following (bad :-) tradition, will still be called T everywhere besides the last part
of Sec. 3 where temperature fluctuations are discussed explicitly.
variables, in the thermodynamic equilibrium, equal zero, Eq. (6) immediately yields their r.m.s.
fluctuations:

$$\delta p_j = (mT)^{1/2}, \qquad \delta q_j = \left(\frac{T}{m\omega^2}\right)^{1/2}.$$ (5.13)


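The generalization of these classical relations to the quantum-mechanical case ($T \sim \hbar\omega$) for a 1D
harmonic oscillator is provided by Eqs. (2.78) and (2.81):

$$\delta p_j = \left[\frac{\hbar m \omega}{2} \coth\frac{\hbar\omega}{2T}\right]^{1/2}, \qquad \delta q_j = \left[\frac{\hbar}{2m\omega} \coth\frac{\hbar\omega}{2T}\right]^{1/2}.$$ (5.14)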


However, the intensity of fluctuations in other systems requires special calculations. Moreover, 
only a few cases allow for general, model-independent results. Let us review some of them. 
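Before moving on, the limits of Eqs. (14) may be verified numerically. The short Python sketch below (with hypothetical function names, and with units in which $\hbar = 1$ by default) evaluates both r.m.s. fluctuations; at $T \gg \hbar\omega$ they should approach the classical values (13), while at $T \ll \hbar\omega$ their product should approach $\hbar/2$.

```python
import math


def delta_p(m, omega, T, hbar=1.0):
    """R.m.s. momentum fluctuation of a 1D harmonic oscillator, first of Eqs. (14)."""
    return math.sqrt(0.5 * hbar * m * omega / math.tanh(hbar * omega / (2.0 * T)))


def delta_q(m, omega, T, hbar=1.0):
    """R.m.s. coordinate fluctuation of a 1D harmonic oscillator, second of Eqs. (14)."""
    return math.sqrt(0.5 * hbar / (m * omega) / math.tanh(hbar * omega / (2.0 * T)))


if __name__ == "__main__":
    m, omega = 1.0, 1.0
    for T in (0.01, 1.0, 1000.0):
        # Last two columns: the classical values (13), for comparison.
        print(T, delta_p(m, omega, T), delta_q(m, omega, T),
              math.sqrt(m * T), math.sqrt(T / (m * omega ** 2)))
```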


5.2. Energy and the number of particles 

First of all, note that fluctuations of macroscopic variables depend on particular conditions. 2 For
example, in a mechanically- and thermally-insulated system, e.g., a member of a microcanonical
ensemble, there are no fluctuations of internal energy: $\delta E = 0$.

However, if a system is in thermal contact with the environment, for example is a member of a
canonical ensemble (Fig. 2.6), the Gibbs distribution (2.58)-(2.59) is valid. We already know that
application of this distribution to energy itself,

$$\langle E \rangle = \sum_m W_m E_m, \qquad W_m = \frac{1}{Z}\exp\left\{-\frac{E_m}{T}\right\}, \qquad Z = \sum_m \exp\left\{-\frac{E_m}{T}\right\},$$ (5.15)

yields Eq. (2.61b), which may be rewritten in the form

$$\langle E \rangle = \frac{1}{Z}\frac{\partial Z}{\partial(-\beta)}, \quad \text{with } \beta \equiv \frac{1}{T},$$ (5.16)


more convenient for our current purposes. Now let us carry out a similar calculation for variable $E^2$:

$$\langle E^2 \rangle = \sum_m W_m E_m^2 = \frac{1}{Z}\sum_m E_m^2 \exp\{-\beta E_m\}.$$ (5.17)

It is straightforward to check, by double differentiation, that this expression may be rewritten as

$$\langle E^2 \rangle = \frac{1}{Z}\frac{\partial^2}{\partial(-\beta)^2}\sum_m \exp\{-\beta E_m\} = \frac{1}{Z}\frac{\partial^2 Z}{\partial(-\beta)^2}.$$ (5.18)

Now it is straightforward to use Eq. (4) to calculate the energy fluctuation variance:


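$$\langle \tilde{E}^2 \rangle = \langle E^2 \rangle - \langle E \rangle^2 = \frac{1}{Z}\frac{\partial^2 Z}{\partial(-\beta)^2} - \frac{1}{Z^2}\left(\frac{\partial Z}{\partial(-\beta)}\right)^2 = \frac{\partial}{\partial(-\beta)}\left(\frac{1}{Z}\frac{\partial Z}{\partial(-\beta)}\right) = \frac{\partial \langle E \rangle}{\partial(-\beta)}.$$ (5.19)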


2 Unfortunately, even in some popular textbooks, a few formulas pertaining to fluctuations are either incorrect, or 
given without specifying the conditions of their applicability, so that reader’s caution is advised. 
Since Eq. (15) is valid only if system's volume V is fixed, it is customary to rewrite this extremely
simple and important result as follows:

$$\langle \tilde{E}^2 \rangle = \frac{\partial \langle E \rangle}{\partial(-1/T)} = T^2 \left(\frac{\partial \langle E \rangle}{\partial T}\right)_V \equiv C_V T^2.$$ (5.20)


This is a remarkably simple, fundamental result. As a sanity check, for a system of N similar,
independent particles, $\langle E \rangle$ and hence $C_V$ are proportional to N, so that $\delta E \propto N^{1/2}$ and $\delta E/\langle E \rangle \propto N^{-1/2}$,
in agreement with Eq. (12). Let me emphasize that the classically-looking Eq. (20) is based on the
general Gibbs distribution, and hence is valid for any system - either classical or quantum.
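Equation (20) may also be verified numerically for any finite set of energy levels. The Python sketch below is a minimal illustration: the level values, the temperature, and the function names are arbitrary assumptions made for the example. It computes the variance directly from the Gibbs distribution (15) and compares it with $C_V T^2$, where the derivative $\partial \langle E \rangle/\partial T$ is estimated by a central finite difference.

```python
import math


def canonical_moments(levels, T):
    """Return <E> and <E^2> for the Gibbs distribution over the given energy levels."""
    weights = [math.exp(-E / T) for E in levels]
    Z = sum(weights)
    e1 = sum(w * E for w, E in zip(weights, levels)) / Z
    e2 = sum(w * E * E for w, E in zip(weights, levels)) / Z
    return e1, e2


def energy_variance(levels, T):
    """Variance of energy, <E^2> - <E>^2, as in Eq. (4b)."""
    e1, e2 = canonical_moments(levels, T)
    return e2 - e1 ** 2


def heat_capacity(levels, T, dT=1e-5):
    """C_V = d<E>/dT at fixed levels, estimated by a central finite difference."""
    up, _ = canonical_moments(levels, T + dT)
    down, _ = canonical_moments(levels, T - dT)
    return (up - down) / (2.0 * dT)


if __name__ == "__main__":
    levels, T = [0.0, 1.0, 3.0], 0.7   # arbitrary illustrative values
    print(energy_variance(levels, T), heat_capacity(levels, T) * T ** 2)
```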

We will discuss the corollaries of this result in the next section, and now let me carry out a very
similar calculation for a system whose number N of particles is not fixed, because they may
go to, and come from the environment at will. If the chemical potential $\mu$ of the environment and its
temperature T are fixed, we are dealing with the grand canonical ensemble (Fig. 2.13), and may use the
grand canonical distribution (2.106)-(2.107):

$$W_{m,N} = \frac{1}{Z_G}\exp\left\{\frac{\mu N - E_{m,N}}{T}\right\}, \qquad Z_G = \sum_{N,m}\exp\left\{\frac{\mu N - E_{m,N}}{T}\right\}.$$ (5.21)

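Acting exactly as we did above for energy, we get

$$\langle N \rangle = \frac{1}{Z_G}\sum_{m,N} N \exp\left\{\frac{\mu N - E_{m,N}}{T}\right\} = \frac{T}{Z_G}\frac{\partial Z_G}{\partial \mu},$$ (5.22)

$$\langle N^2 \rangle = \frac{1}{Z_G}\sum_{m,N} N^2 \exp\left\{\frac{\mu N - E_{m,N}}{T}\right\} = \frac{T^2}{Z_G}\frac{\partial^2 Z_G}{\partial \mu^2},$$ (5.23)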

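so that the particle number variance is

$$\langle \tilde{N}^2 \rangle = \langle N^2 \rangle - \langle N \rangle^2 = \frac{T^2}{Z_G}\frac{\partial^2 Z_G}{\partial \mu^2} - \frac{T^2}{Z_G^2}\left(\frac{\partial Z_G}{\partial \mu}\right)^2 = T\frac{\partial}{\partial \mu}\left(\frac{T}{Z_G}\frac{\partial Z_G}{\partial \mu}\right) = T\frac{\partial \langle N \rangle}{\partial \mu},$$ (5.24)

in the full analogy with Eq. (19).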

For example, for the ideal classical gas we had Eq. (3.32). As was already emphasized in Sec.
3.2, though that result has been obtained from the canonical ensemble in which the number of particles N
is fixed, at $N \gg 1$ the fluctuations of N in the grand canonical ensemble should be relatively small, so
that the same relation should be valid for the average $\langle N \rangle$ in that ensemble. Solving that relation for $\langle N \rangle$, we
get

$$\langle N \rangle = \mathrm{const} \times \exp\left\{\frac{\mu}{T}\right\},$$ (5.25)


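where "const" means a factor that is constant at the differentiation of $\langle N \rangle$ over $\mu$, required by Eq. (24).
Performing the differentiation and then using Eq. (25) again,

$$\frac{\partial \langle N \rangle}{\partial \mu} = \mathrm{const} \times \frac{1}{T}\exp\left\{\frac{\mu}{T}\right\} = \frac{\langle N \rangle}{T},$$ (5.26)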

Fluctuations 
of E 


Fluctuations 
of N 
Fluctuations of N in classical gas

Binomial distribution

we get from Eq. (24) a surprisingly simple result:

$$\langle \tilde{N}^2 \rangle = \langle N \rangle, \quad \text{i.e. } \delta N = \langle N \rangle^{1/2}.$$ (5.27)


This relation is so simple and important that I will now show how it may be derived in a different
way, in order to prove that this result is valid for systems with an arbitrary (say, small) N, and also get
more detailed information about the statistics of fluctuations of that number. Let us consider an ideal
classical gas of $N_0$ particles in a volume $V_0$, and calculate the probability $W_N$ to have exactly $N \le N_0$ of
these particles in a part $V \le V_0$ of this volume - see Fig. 1.



Fig. 5.1. Deriving the binomial 
and Poissonian distributions. 


For one particle such probability is of course $W = V/V_0 \le 1$, while the probability of one particle
being in the remaining part of the volume is $W' = 1 - W = 1 - V/V_0$. If all particles were distinguishable,
the probability of having $N \le N_0$ specific particles in volume V, and $(N_0 - N)$ specific particles in volume
$(V_0 - V)$, would be $W^N W'^{(N_0 - N)}$. However, if we do not distinguish the particles, we should multiply the
probability by the number of possible particle combinations keeping numbers N and $N_0$ constant, i.e. by
the binomial coefficient $N_0!/N!(N_0 - N)!$. 3 As the result, the required probability is

$$W_N = W^N W'^{(N_0 - N)} \frac{N_0!}{N!(N_0 - N)!} = \left(\frac{\langle N \rangle}{N_0}\right)^N \left(1 - \frac{\langle N \rangle}{N_0}\right)^{N_0 - N} \frac{N_0!}{N!(N_0 - N)!},$$ (5.28)


where in the second instance I have used the evident expression $\langle N \rangle = W N_0 = (V/V_0) N_0$ for the average
number of particles in volume V. Relation (28) is the so-called binomial probability distribution, valid
for any $\langle N \rangle$ and $N_0$.

If we are interested in keeping $\langle N \rangle$ arbitrary, but do not care how large the additional volume $(V_0 - V)$
is, we can simplify the binomial distribution by assuming that the external part, and hence $N_0$, are
very large:

$$N_0 \gg N,$$ (5.29)


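where N means all values of interest, including $\langle N \rangle$. In this limit we can neglect N in comparison with $N_0$
in the second exponent of Eq. (28), and also approximate the fraction $N_0!/(N_0 - N)!$, i.e. the product of N
terms, $(N_0 - N + 1)(N_0 - N + 2) \cdots (N_0 - 1) N_0$, as just $N_0^N$. As a result, we get

$$W_N \approx \left(\frac{\langle N \rangle}{N_0}\right)^N \left(1 - \frac{\langle N \rangle}{N_0}\right)^{N_0} \frac{N_0^N}{N!} = \frac{\langle N \rangle^N}{N!}\left(1 - \frac{\langle N \rangle}{N_0}\right)^{N_0} = \frac{\langle N \rangle^N}{N!}\left[(1 - W)^{1/W}\right]^{\langle N \rangle}.$$ (5.30)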


3 See, e.g., MA Eq. (2.2). 
In the limit (29), $W \to 0$, and the factor inside the square brackets tends to 1/e, the reciprocal of the natural
logarithm base. 4 Thus, we finally get an expression independent of $N_0$:

Poisson distribution

$$W_N = \frac{\langle N \rangle^N}{N!} e^{-\langle N \rangle}.$$ (5.31)

This is the much celebrated Poisson distribution, which describes a very broad family of random
phenomena. Figure 2 shows this distribution for several values of $\langle N \rangle$ - which, in contrast to N, are not
necessarily integer.




Fig. 5.2. The Poisson distribution for several values of $\langle N \rangle$. In contrast to that average, argument N
may take only integer values, so that lines are only guides for the eye.


At very small $\langle N \rangle$, the function $W_N(N)$ is close to an exponential one, $W_N \propto W^N \propto \langle N \rangle^N$,
while in the opposite limit, $\langle N \rangle \gg 1$, it rapidly approaches the Gaussian (alternatively called "normal")
distribution

Gaussian distribution

$$W_N = \frac{1}{(2\pi)^{1/2}\,\delta N}\exp\left\{-\frac{(N - \langle N \rangle)^2}{2(\delta N)^2}\right\}, \quad \text{with } \delta N = \langle N \rangle^{1/2}.$$ (5.32)

(Note that the Gaussian distribution is also valid if both N and $N_0$ are large, regardless of relation (29)
between them - see Fig. 3.)



4 Indeed, this is the most popular definition of this major mathematical constant - see, e.g., MA Eq. (1.2a) with n
replaced with -1/W.
The key property of the Poisson (and hence of the Gaussian) distribution is that it has the same
variance as given by Eq. (27):

$$\langle \tilde{N}^2 \rangle \equiv \langle (N - \langle N \rangle)^2 \rangle = \langle N \rangle.$$ (5.33)

(This is not true for the general binomial distribution.) For our current purposes, this means that for the
ideal classical gas, Eq. (27) is valid for any number of particles.
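The relation between the binomial distribution (28) and its Poisson limit (31) can be explored with a short Python sketch. The function names and the particular numbers below are illustrative assumptions. The code tabulates both distributions for the same average $\langle N \rangle$ and computes their variances, so that Eq. (33) and the remark about the binomial distribution may be checked directly.

```python
import math


def binomial_pmf(n, n0, w):
    """Binomial probability (28) of finding n of n0 particles in the sub-volume, with W = w."""
    return math.comb(n0, n) * w ** n * (1.0 - w) ** (n0 - n)


def poisson_pmf(n, avg):
    """Poisson probability (31) for the average number avg."""
    return avg ** n * math.exp(-avg) / math.factorial(n)


def moments(pmf_values):
    """Mean and variance of a distribution tabulated for n = 0, 1, 2, ..."""
    avg = sum(n * p for n, p in enumerate(pmf_values))
    var = sum((n - avg) ** 2 * p for n, p in enumerate(pmf_values))
    return avg, var


if __name__ == "__main__":
    n0, avg = 1000, 3.0            # N_0 >> <N>, as required by Eq. (29)
    n_max = 60                     # the tails beyond this value are negligible here
    binom = [binomial_pmf(n, n0, avg / n0) for n in range(n_max + 1)]
    poiss = [poisson_pmf(n, avg) for n in range(n_max + 1)]
    print("binomial mean, variance:", moments(binom))
    print("Poisson  mean, variance:", moments(poiss))
```

For the binomial distribution the variance equals $\langle N \rangle (1 - W)$, which approaches the Poisson value $\langle N \rangle$ only in the limit $W \to 0$.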


5.3. Volume and temperature 

What are the r.m.s. fluctuations of other thermodynamic variables - like V, T, etc.? Again, the
answer depends on conditions. For example, if the volume V occupied by a gas is externally fixed (say,
by rigid walls), it evidently does not fluctuate at all: $\delta V = 0$. On the other hand, the volume may
fluctuate in the situation when average pressure is fixed - see, e.g., Fig. 1.5. A formal calculation of
these fluctuations, using the approach applied in the last section, is hampered by the fact that it is
physically impracticable to fix its conjugate variable, P, i.e. suppress its fluctuations. For example, the
force $\mathcal{F}(t)$ exerted by an ideal classical gas on vessel's wall (whose measure the pressure is) is the result
of individual, independent hits of the wall by particles (Fig. 4), with time scale $\tau_c \sim r_B/(T/m)^{1/2} \sim 10^{-16}$ s,
so that its frequency spectrum extends to very high frequencies, virtually impossible to control.



Fig. 5.4. Force exerted by gas 
particles on container’s wall, as a 
function of time (schematically). 


However, we can use the following trick, very typical for the theory of fluctuations. It is almost
evident that r.m.s. fluctuations of volume are independent of the shape of the container. Let us consider
the particular situation similar to that shown in Fig. 1.5, with the container of a cylindrical shape, with
the base area A. 5 Then the coordinate of the piston is just q = V/A, while the average force exerted by the
gas on the cylinder is $\langle \mathcal{F} \rangle = \langle P \rangle A$ - see Fig. 5. Now if the piston is sufficiently massive, its free oscillation
frequency $\omega$ near the equilibrium position is small enough to satisfy the following three conditions.

First, besides balancing the average force $\langle \mathcal{F} \rangle$, and thus sustaining average pressure $\langle P \rangle = \langle \mathcal{F} \rangle/A$
of the gas, the interaction between the heavy piston and light molecules of the gas is weak because of a
relatively short duration of the wall hits (Fig. 4). Because of that, the full energy of the system may be
presented as a sum of those of the gas and the piston, with a quadratic contribution to piston's potential
energy from small deviations from equilibrium:

$$U_p = \frac{\kappa}{2}\tilde{q}^2, \qquad \tilde{q} \equiv q - \langle q \rangle = \frac{\tilde{V}}{A},$$ (5.34)


5 As a reminder, in geometry the term "cylinder" does not necessarily mean the "circular cylinder"; the shape of
base A may be arbitrary; it just should not change with height.




where $\kappa$ is the effective spring constant arising from gas' compressibility.

Fig. 5.5. Deriving Eq. (37).


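Second, at $\omega \to 0$, that spring constant may be calculated just as for slow, quasi-static variations of
volume, with the gas remaining in quasi-equilibrium at all times:

$$\kappa = -\frac{\partial \langle \mathcal{F} \rangle}{\partial q} = -A^2 \frac{\partial \langle P \rangle}{\partial \langle V \rangle}.$$ (5.35)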


This partial derivative 6 should be taken at whatever the given thermal conditions are, e.g., with S = const
for adiabatic conditions (i.e., thermally insulated gas), or with T = const for isothermal conditions (gas
in a good thermal contact with a heat bath), etc. With that constant denoted as X, Eqs. (34)-(35) give

$$U_p = \frac{1}{2}\left(-A^2 \frac{\partial \langle P \rangle}{\partial \langle V \rangle}\right)_X \left(\frac{\tilde{V}}{A}\right)^2 = \frac{1}{2}\left(-\frac{\partial \langle P \rangle}{\partial \langle V \rangle}\right)_X \tilde{V}^2.$$ (5.36)


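Finally, making $\omega$ sufficiently small (namely, $\hbar\omega \ll T$) by a sufficiently large piston mass, we can
apply, to the piston's fluctuations, the classical equipartition theorem: $\langle U_p \rangle = T/2$, giving

$$\langle \tilde{V}^2 \rangle_X = T\left(-\frac{\partial \langle V \rangle}{\partial \langle P \rangle}\right)_X.$$ (5.37a)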


Since this result is valid for any A and $\omega$, it should not depend on system's geometry and piston
mass, provided that it is large in comparison with the effective mass of a single system component (say,
a gas molecule) - the condition that is naturally fulfilled in most experiments. 7 For the particular case of
fluctuations at constant temperature (X = T), we may use the second of Eqs. (1.39) to rewrite Eq. (37a)
as


6 As already was discussed in Sec. 4.1 in the context of the van der Waals equation, for mechanical stability of a 
gas (or liquid), derivative $\partial P/\partial V$ has to be negative, so that $\kappa$ is positive. 

7 One may meet statements that a similar formula, 

$$\langle\tilde{P}^2\rangle_X = T\left(-\frac{\partial\langle P\rangle}{\partial\langle V\rangle}\right)_X, \qquad \text{(WRONG!)}$$

is valid for pressure fluctuations. However, such a statement does not take into account the different physical nature 
of pressure (Fig. 4), with its very broad frequency spectrum. This issue will be discussed later in this chapter. 


$$\langle\tilde{V}^2\rangle_T = -T\left(\frac{\partial^2 G}{\partial\langle P\rangle^2}\right)_T. \tag{5.37b}$$


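In the specific case of an ideal classical gas of N particles, with the equation of state $\langle V\rangle = NT/\langle P\rangle$, it is 
easier to use directly Eq. (37a), with X = T, to get 

$$\langle\tilde{V}^2\rangle_T = -T\left(\frac{\partial\langle V\rangle}{\partial\langle P\rangle}\right)_T = \frac{\langle V\rangle^2}{N}, \qquad \text{i.e. } \frac{\delta V}{\langle V\rangle} = \frac{1}{N^{1/2}}, \tag{5.38}$$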


in agreement with the trend given by Eq. (12). 
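As a quick numerical cross-check of Eq. (38), the sketch below evaluates Eq. (37a) at constant temperature by a finite-difference derivative of the ideal-gas equation of state. The function name `volume_variance` and the values of N, T and P are illustrative assumptions, not taken from the text; temperature is in energy units, as everywhere in this chapter.

```python
import math

def volume_variance(volume_of_pressure, pressure, temperature, rel_step=1e-6):
    """Eq. (5.37a) at X = T: <V~^2> = -T (d<V>/d<P>)_T, via a central difference."""
    dp = pressure * rel_step
    dV_dP = (volume_of_pressure(pressure + dp)
             - volume_of_pressure(pressure - dp)) / (2.0 * dp)
    return -temperature * dV_dP

# Illustrative (assumed) parameters; T is in energy units.
N, T, P = 1.0e6, 2.0, 5.0

def ideal_gas_volume(p):
    """Equation of state <V> = N T / <P>."""
    return N * T / p

mean_V = ideal_gas_volume(P)
relative_rms = math.sqrt(volume_variance(ideal_gas_volume, P, T)) / mean_V

# Both numbers should agree: delta V / <V> = 1 / N^(1/2).
print(relative_rms, 1.0 / math.sqrt(N))
```

Any other equation of state may be passed as `volume_of_pressure`, as long as the derivative is taken under the intended thermal condition (here, T = const).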

Now let us proceed to fluctuations of temperature, for simplicity focusing on the case V = const. 
Let us again assume that the system we are considering is weakly coupled to a heat bath of temperature 
$T_0$, in the sense that the time $\tau$ of temperature equilibration between the two is much larger than the 
internal temperature relaxation (thermalization) time. Then we may assume that T changes in the whole 
system virtually simultaneously, and consider it a function of time alone: 

$$T = \langle T\rangle + \tilde{T}(t). \tag{5.39}$$


Moreover, due to the (relatively) large $\tau$, we may use the stationary relation between small fluctuations 
of temperature and the internal energy of the system: 

$$\tilde{T}(t) = \frac{\tilde{E}(t)}{C_V}, \qquad \text{so that } \delta T = \frac{\delta E}{C_V}. \tag{5.40}$$


With those assumptions, Eq. (20) immediately yields the famous expression for the so-called 
thermodynamic fluctuations of temperature: 

$$\delta T = \frac{\delta E}{C_V} = \frac{\langle T\rangle}{C_V^{1/2}}. \tag{5.41}$$


The most straightforward application of this result is to the analysis of so-called bolometers - 
broadband detectors of electromagnetic radiation in microwave and infrared frequency bands. In such a 
detector (Fig. 6), the incoming radiation is focused on a small sensor (e.g., either a small piece of a Ge 
crystal, or a superconductor thin film at temperature $T \approx T_c$, etc.) that is well isolated thermally from the 
environment. 


Fig. 5.6. Conceptual scheme of a bolometer. 


As a result, the absorption of even small radiation power $\mathcal{P}$ leads to a noticeable change $\Delta T$ of 
the sensor’s average temperature $\langle T\rangle$ and hence of its electric resistance R, which is probed by low-noise 
external electronics. 8 If the power does not change in time too fast, $\Delta T$ is a certain function of $\mathcal{P}$, turning 
into 0 at $\mathcal{P} = 0$. Hence, if $\Delta T$ is much lower than the environment temperature $T_0$, we may keep only the 
main, linear term in its Taylor expansion in $\mathcal{P}$: 

$$\Delta T \equiv \langle T\rangle - T_0 = \frac{\mathcal{P}}{\mathcal{G}}, \tag{5.42}$$


where the coefficient $\mathcal{G} \equiv \partial\mathcal{P}/\partial T$ is called the thermal conductance of the unavoidable thermal coupling 
between the sensor and the heat bath - see Fig. 6. The power may be detected if the electric signal from 
the sensor, which results from the change $\Delta T$, is not drowned in spontaneous fluctuations. In practical 
systems, these fluctuations are contributed by several sources including electronic amplifiers, the sensor, 
etc. However, in modern systems these “technical” contributions to noise are successfully suppressed, 
and the dominating noise source is the fundamental fluctuations of the sensor temperature, described by 
Eq. (41). In this case the so-called noise-equivalent power (“NEP”), defined as the level of $\mathcal{P}$ that 
produces a signal equal to the r.m.s. value of noise, may be calculated by equating Eqs. (41) (with $\langle T\rangle \approx T_0$) 
and (42): 

$$\mathrm{NEP} = \frac{T_0\,\mathcal{G}}{C_V^{1/2}}. \tag{5.43}$$


This expression shows that in order to decrease NEP, i.e. improve the device sensitivity, both the 
environment temperature $T_0$ and thermal conductance $\mathcal{G}$ should be reduced. In modern receivers of 
radiation, their typical values (in SI units) are of the order of 0.1 K and $10^{-10}$ W/K, respectively. 

On the other hand, Eq. (43) implies that in order to increase bolometer sensitivity, i.e. reduce 
NEP, the $C_V$ of the sensor, and hence its mass, should be increased. This conclusion is valid only to a 
certain extent, because due to technical reasons (parameter drift and the so-called 1/f noise of the sensor 
and external electronics), incoming power has to be modulated with as high a frequency $\omega$ as possible (in 
most cases, the cyclic frequency $\nu = \omega/2\pi$ of the modulation is between 10 and 1,000 Hz), so that the 
electrical signal may be picked up from the sensor at that frequency. As a result, $C_V$ may be increased 
only until the thermal constant of the sensor, 

$$\tau = \frac{C_V}{\mathcal{G}}, \tag{5.44}$$


becomes close to $1/\omega$, because at $\omega\tau \gg 1$ the useful signal drops faster than noise. As a result, the 
lowest (i.e. the best) value of NEP, 

$$(\mathrm{NEP})_{\min} = \alpha T_0(\mathcal{G}\nu)^{1/2}, \qquad \alpha \sim 1, \tag{5.45}$$

is reached at $\nu\tau \sim 1$. (The exact values of the optimal product $\omega\tau$, and the numerical constant $\alpha \sim 1$ in 
Eq. (45), depend on the exact law of power modulation in time, and the output signal processing 
procedure.) With the parameters cited above, this estimate yields $(\mathrm{NEP})_{\min}/\nu^{1/2} \sim 3\times10^{-17}$ W/Hz$^{1/2}$ - a 
very low power indeed. 


8 Besides low internal electric noise, the sensor should have a sufficiently large temperature responsivity dR/dT, 
making the noise contribution by the pickup electronics insignificant - see below. 

However, surprisingly enough, the power modulation allows bolometric (and other broadband) 
receivers to register radiation with power much lower than this NEP! Indeed, picking up the sensor 
signal at the modulation frequency $\omega$, we can use the following electronics stages to filter out all the 
noise besides its components within a very narrow band, of width $\Delta\nu \ll \nu$, around the modulation 
frequency (Fig. 7). This is the idea of a microwave radiometer, 9 currently used in all sensitive 
broadband receivers. 


Fig. 5.7. Basic idea of the Dicke radiometer. [Sketch labels: input power; power modulation.] 


In order to analyze this opportunity, we need to develop theoretical tools for a quantitative 
description of the spectral distribution of fluctuations. Another motivation for that description is the 
need for an analysis of variables dominated by fast (high-frequency) components, such as pressure - please 
have one more look at Fig. 4. Finally, during the analysis, we will run into the fundamental relation 
between fluctuations and dissipation, which is one of the main results of statistical physics as a whole. 


5.4. Fluctuations as functions of time 

There are two mathematically-equivalent approaches to the description of functions of time, called 
time-domain and frequency-domain pictures, with their relative convenience depending on the particular 
problem to be solved. 

In the time domain, we cannot characterize a random fluctuation $\tilde{f}(t)$ of a classical variable by 
its statistical average, because it equals zero - see Eq. (2). Of course, variance (3) does not vanish, but if 
fluctuations are stationary, it does not depend on time either. Because of that, let us consider the 
following average: 10 

$$\langle\tilde{f}(t)\tilde{f}(t')\rangle. \tag{5.46}$$

Generally, this is a function of two arguments. However, in the systems that are stationary (whose 
macroscopic parameters and hence the variable expectation values do not change with time), averages 
like (46) may depend only on the difference, 

$$\tau \equiv t' - t, \tag{5.47}$$


9 It was pioneered in the 1950s by R. Dicke, so that the device is frequently called the Dicke radiometer. 

10 Clearly, this is a temporal analog of the spatial correlation function discussed in Sec. 4.2 - see Eq. (4.30). 

between the two observation times. In this case, average (46) is called the correlation function of 
variable $f$: 

$$K_f(\tau) \equiv \langle\tilde{f}(t)\tilde{f}(t+\tau)\rangle. \tag{5.48}$$


This name 11 catches the idea of this notion very well: $K_f(\tau)$ tells us about the average mutual relation 
between the fluctuations at two times separated by the interval $\tau$. Let us list the basic properties of this 
function. 


First of all, $K_f(\tau)$ has to be an even function of the time delay $\tau$. Indeed, we may write 

$$K_f(-\tau) = \langle\tilde{f}(t)\tilde{f}(t-\tau)\rangle = \langle\tilde{f}(t-\tau)\tilde{f}(t)\rangle = \langle\tilde{f}(t')\tilde{f}(t'+\tau)\rangle, \tag{5.49}$$

with $t' \equiv t - \tau$. For stationary processes, this average cannot depend on the common shift $t'$ of the two 
observation times, so that averages (48) and (49) have to be equal: 

$$K_f(-\tau) = K_f(\tau). \tag{5.50}$$

Second, at $\tau \to 0$ the correlation function tends to the variance: 

$$K_f(0) = \langle\tilde{f}(t)\tilde{f}(t)\rangle = \langle\tilde{f}^2\rangle. \tag{5.51}$$

In the opposite limit, when $\tau$ is much larger than some characteristic correlation time $\tau_c$ of the system, 12 
the correlation function tends to zero, because fluctuations separated by such a large time interval are 
virtually independent (uncorrelated). As a result, the correlation function typically looks like one of the 
plots sketched in Fig. 8. Note that on a time scale much longer than $\tau_c$, any physically-realistic 
correlation function may be well approximated with a delta-function of $\tau$. 13 
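The properties listed above can be illustrated on synthetic data. The sketch below is only an illustration under stated assumptions: the random process is a first-order autoregressive sequence (a discrete-time stand-in for a process with an exponentially decaying correlation function), the estimator `correlation_function` replaces the ensemble average in Eq. (48) by a time average over one long record, and all parameter values are arbitrary.

```python
import numpy as np

def correlation_function(f, max_lag):
    """Estimate K_f(tau) = <f~(t) f~(t + tau)> for lags 0..max_lag by time averaging."""
    f = np.asarray(f, dtype=float)
    f = f - f.mean()                       # keep only the fluctuation f~
    n = len(f)
    return np.array([np.mean(f[: n - k] * f[k:]) for k in range(max_lag + 1)])

# Assumed test process: f[i] = a * f[i-1] + unit-variance Gaussian noise.
# Its exact correlation function is a**|k| / (1 - a**2), i.e. tau_c = -1/ln(a) steps.
rng = np.random.default_rng(0)
a, n_steps, max_lag = 0.9, 200_000, 60
noise = rng.normal(size=n_steps)
f = np.empty(n_steps)
f[0] = noise[0] / np.sqrt(1.0 - a**2)      # start in the stationary distribution
for i in range(1, n_steps):
    f[i] = a * f[i - 1] + noise[i]

K = correlation_function(f, max_lag)
K_exact = a ** np.arange(max_lag + 1) / (1.0 - a**2)

print("K(0) vs variance:", K[0], np.var(f))            # Eq. (5.51)
print("K at the largest lag:", K[-1])                   # small at tau >> tau_c
print("max deviation from exact:", np.max(np.abs(K - K_exact)))
```

Because the estimator multiplies the same pair of samples regardless of their order, it returns the same value for lags $+\tau$ and $-\tau$, in line with Eq. (50).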


Fig. 5.8. Correlation function of 
fluctuations: two typical examples. 


In the reciprocal, frequency domain, process $\tilde{f}(t)$ is presented as a Fourier integral, 

$$\tilde{f}(t) = \int_{-\infty}^{+\infty} f_\omega e^{-i\omega t}\,d\omega, \tag{5.52}$$

with the reciprocal transform being 


11 Another term, the autocorrelation function, is sometimes used for average (48) to distinguish it from the mutual 
correlation function, $\langle\tilde{f}(t)\tilde{g}(t+\tau)\rangle$, of two stationary processes. 

12 Correlation time $\tau_c$ is the direct temporal analog of the correlation radius $r_c$ which was discussed in Sec. 4.2. 

13 For example, for a process which is a sum of independent very short pulses, e.g., the gas pressure force exerted 
on the container wall (Fig. 4), such an approximation is legitimate on time scales longer than the single pulse 
duration, e.g., the time of a particle’s impact on the wall. 





$$f_\omega = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\tilde{f}(t)e^{i\omega t}\,dt. \tag{5.53}$$

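If the initial function $\tilde{f}(t)$ is random (as it is in the case of fluctuations), with zero average, its Fourier 
transform $f_\omega$ is a random function (now of frequency) as well, also with a vanishing statistical average: 

$$\langle f_\omega\rangle = \left\langle\frac{1}{2\pi}\int_{-\infty}^{+\infty}\tilde{f}(t)e^{i\omega t}\,dt\right\rangle = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\langle\tilde{f}(t)\rangle e^{i\omega t}\,dt = 0. \tag{5.54}$$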


The simplest nonvanishing average may be formed similarly to Eq. (46), but with due respect to the 
complex-variable character of the Fourier images: 

$$\langle f_\omega f^*_{\omega'}\rangle = \frac{1}{(2\pi)^2}\int_{-\infty}^{+\infty}dt'\int_{-\infty}^{+\infty}dt\,\langle\tilde{f}(t)\tilde{f}(t')\rangle\,e^{i(\omega' t' - \omega t)}. \tag{5.55}$$

It turns out that for a stationary process, averages (46) and (55) are directly related. Indeed, since 
the integration over $t'$ in Eq. (55) is in infinite limits, we may replace it with integration over $\tau \equiv t' - t$ 
(at fixed $t$), also in infinite limits. Replacing $t'$ by $t + \tau$ in expressions under the integral, we see that 
the average is just the correlation function $K_f(\tau)$, while the time exponent is equal to 
$\exp\{i(\omega' - \omega)t\}\exp\{i\omega'\tau\}$. As a result, changing the order of integration, we get 

$$\langle f_\omega f^*_{\omega'}\rangle = \frac{1}{(2\pi)^2}\int_{-\infty}^{+\infty}dt\int_{-\infty}^{+\infty}d\tau\,K_f(\tau)e^{i(\omega'-\omega)t}e^{i\omega'\tau} = \frac{1}{(2\pi)^2}\int_{-\infty}^{+\infty}K_f(\tau)e^{i\omega'\tau}\,d\tau\int_{-\infty}^{+\infty}e^{i(\omega'-\omega)t}\,dt. \tag{5.56}$$

But the last integral is just $2\pi\delta(\omega - \omega')$, 14 so that we finally get 

$$\langle f_\omega f^*_{\omega'}\rangle = S_f(\omega)\,\delta(\omega - \omega'), \tag{5.57}$$

where the real function of frequency, 

$$S_f(\omega) \equiv \frac{1}{2\pi}\int_{-\infty}^{+\infty}K_f(\tau)e^{i\omega\tau}\,d\tau = \frac{1}{\pi}\int_0^{\infty}K_f(\tau)\cos\omega\tau\,d\tau, \tag{5.58}$$

is called the spectral density of fluctuations at frequency $\omega$. According to Eq. (58), the spectral density is 
a Fourier image of the correlation function, and hence the reciprocal Fourier transform is: 15, 16 

$$K_f(\tau) = \int_{-\infty}^{+\infty}S_f(\omega)e^{-i\omega\tau}\,d\omega = 2\int_0^{\infty}S_f(\omega)\cos\omega\tau\,d\omega. \tag{5.59}$$

In particular, for the variance, Eq. (59) yields 


14 See, e.g., MA Eq. (14.4a). 

15 The second form of Eq. (59) uses the fact that, according to Eq. (58), $S_f(\omega)$ is an even function of frequency - 
just as $K_f(\tau)$ is an even function of time. 

16 Although Eqs. (58) and (59) look not much more than straightforward corollaries of the Fourier transform, they 
bear a special name of the Wiener-Khinchin theorem - after mathematicians N. Wiener and A. Khinchin who have 
proved that these relations are valid even for functions $f(t)$ which are not square-integrable, so that from the point 
of view of rigorous mathematics, their Fourier transforms are not well defined. 




$$\langle\tilde{f}^2\rangle = K_f(0) = \int_{-\infty}^{+\infty}S_f(\omega)\,d\omega = 2\int_0^{\infty}S_f(\omega)\,d\omega. \tag{5.60}$$

This relation shows that the term “spectral density” describes the physical sense of function $S_f(\omega)$ 
very well. Indeed, if a random signal $f(t)$ had been passed through a frequency filter with a small 
bandwidth $\Delta\nu \ll \nu$ of positive cyclic frequencies, the integral in Eq. (60) would have to be limited to the interval 
$\Delta\omega = 2\pi\Delta\nu$, i.e. the variance of the output signal would become 17 

$$\langle\tilde{f}^2\rangle_{\Delta\nu} = 2S_f(\omega)\Delta\omega = 4\pi S_f(\omega)\Delta\nu. \tag{5.61}$$

To complete this introductory section, let me note an important particular case. If the spectral 
density of some process is nearly constant within the frequency range of interest, $S_f(\omega) = \text{const} = S_f(0)$, 18 
Eq. (59) shows that its correlation function may be well approximated by a delta-function: 

$$K_f(\tau) = S_f(0)\int_{-\infty}^{+\infty}e^{-i\omega\tau}\,d\omega = 2\pi S_f(0)\,\delta(\tau). \tag{5.62}$$

From this relation stems another popular name of the white noise, the delta-correlated process. We have 
already seen that this is a very reasonable approximation, for example, for the gas pressure force 
fluctuations (Fig. 4). Of course, for the spectral density of a realistic, limited physical variable the 
approximation of constant spectral density cannot be true for all frequencies (otherwise, for example, 
integral (60) would diverge, giving an unphysical, infinite value of variance), and is valid only at 
frequencies much lower than $1/\tau_c$. 
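The relations of this section may be checked numerically on a simple model. The sketch below assumes an exponentially decaying correlation function, $K_f(\tau) = \langle\tilde{f}^2\rangle\exp\{-|\tau|/\tau_c\}$ (an illustrative choice, not taken from the text), for which Eq. (58) gives a Lorentzian spectral density, nearly constant ("white") at $\omega\tau_c \ll 1$. The helper names and grid parameters are arbitrary.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid rule along the last axis (local helper, independent of NumPy version)."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def spectral_density(K, tau, omega):
    """Eq. (5.58): S_f(w) = (1/pi) * integral_0^inf K_f(tau) cos(w tau) dtau."""
    return trapezoid(K * np.cos(np.outer(omega, tau)), tau) / np.pi

# Assumed model: K_f(tau) = sigma2 * exp(-|tau| / tau_c), tabulated for tau >= 0.
sigma2, tau_c = 2.0, 0.5
tau = np.linspace(0.0, 20.0 * tau_c, 4001)
omega = np.linspace(0.0, 100.0 / tau_c, 801)
K = sigma2 * np.exp(-tau / tau_c)

S_num = spectral_density(K, tau, omega)
# Analytical Fourier image of the exponential: a Lorentzian of width 1/tau_c.
S_exact = sigma2 * tau_c / (np.pi * (1.0 + (omega * tau_c) ** 2))

# Eq. (5.60): the variance is recovered as 2 * integral_0^inf S_f dw
# (here up to a finite cutoff, so slightly below sigma2).
variance = 2.0 * trapezoid(S_num, omega)

print("max |S_num - S_exact| =", np.max(np.abs(S_num - S_exact)))
print("variance from the spectrum:", variance, " expected:", sigma2)
```

The small deficit of the recovered variance comes only from truncating the frequency integral at a finite cutoff; it shrinks as the cutoff grows.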


5.5. Fluctuations and dissipation 

Now we are mathematically equipped to address one of the most important topics of statistical 
physics, the relation between fluctuations and dissipation. This relation is especially simple for the 
following hierarchical situation: a relatively “heavy”, slowly moving system interacting with an 
environment consisting of rapidly moving, “light” components. A popular theoretical term for such a 
system is the Brownian particle, named after botanist R. Brown who first noticed in 1827, under a microscope, 
the random motion of pollen grains, caused by their random hits by fluid molecules. However, 
the family of such systems is much broader than that of mechanical particles. 19 

One more important assumption of this theory is that the system’s motion does not violate the 
thermal equilibrium of the environment - well fulfilled in many cases. (Think, for example, about a 
usual mechanical pendulum whose motion does not overheat the air around it.) In this case, the 
statistical averaging over the thermally-equilibrium environment may be performed for any (slow) 


17 A popular alternative definition of the spectral density is $S_f(\nu) \equiv 4\pi S_f(\omega)$, making average (61) equal to $S_f(\nu)\Delta\nu$. 

18 Such a process is frequently called white noise, because it consists of all frequency components with equal 
amplitudes, reminiscent of white light, which consists of many monochromatic components. 

19 Just for one example, such a description may be valid for the complex amplitude of an electromagnetic field 
mode weakly interacting with matter. To emphasize this generality, I will use letter q rather than x for the “particle’s” 
coordinate. 

motion of the system of interest, considering the motion fixed. 20 I will denote such a “primary” 
averaging by angular brackets $\langle\ldots\rangle$. At a later stage we may carry out another, “secondary” averaging, 
over an ensemble of many similar systems of interest, coupled to similar environments. If we do, it will 
be denoted by double angle brackets $\langle\langle\ldots\rangle\rangle$. 

Let me start from a simple classical system, a 1D harmonic oscillator whose equation of 
evolution may be presented as 

$$m\ddot{q} + \kappa q = f_{\rm det}(t) + f_{\rm env}(t) = f_{\rm det}(t) + \langle f\rangle + \tilde{f}(t), \tag{5.63}$$

where q is the (generalized) coordinate of the oscillator, $f_{\rm det}(t)$ is the deterministic (generalized) external 
force, while both components of the random force $f_{\rm env}(t)$ present the impact of the environment on the 
oscillator’s motion. Again, from the point of view of the fast-moving environmental components, the 
oscillator’s motion is slow. The average of the force exerted by the environment on such a slowly moving 
object may have a part depending not only on q, but on the velocity $\dot{q}$ as well. For most systems, the 
Taylor expansion of the force in small velocity would have a finite leading, linear term, so that we may 
take 

$$\langle f\rangle = -\eta\dot{q}, \tag{5.64}$$

so that Eq. (63) may be rewritten as 

$$m\ddot{q} + \eta\dot{q} + \kappa q = f_{\rm det}(t) + \tilde{f}(t). \tag{5.65}$$

This way of describing the effects of environment on an otherwise Hamiltonian system is called 
the Langevin equation. 21 Due to the linearity of the differential equation (65), its general solution may 
be presented as a sum of two parts: the deterministic motion of the linear oscillator due to the external 
force $f_{\rm det}(t)$, and random fluctuations due to the random force $\tilde{f}(t)$ exerted by the environment. The former 
effects are well known from classical dynamics, 22 so let us focus on the latter part by taking $f_{\rm det}(t) = 0$. 
The remaining term in the right-hand part describes the fluctuating part of the environmental force; in 
contrast to the average component (64), its intensity (read: its spectral density at relevant frequencies 
$\omega \sim \omega_0 \equiv (\kappa/m)^{1/2}$) does not vanish at $q(t) = 0$, and hence may be evaluated ignoring the system’s motion. 

Plugging into Eq. (65) the presentation of both variables in the form similar to Eq. (52), for their 
Fourier images we get the following relation: 

$$-m\omega^2 q_\omega - i\omega\eta\,q_\omega + \kappa q_\omega = f_\omega, \tag{5.66}$$

which immediately gives us $q_\omega$: 


20 For a usual (ergodic) environment, the primary averaging may be interpreted as that over relatively short time 
intervals, $\tau_c \ll \Delta t \ll \tau$, where $\tau_c$ is the correlation time of the environment, while $\tau$ is the characteristic time 
scale of motion of our “heavy” system of interest. 

21 After P. Langevin whose 1908 work was the first systematic development of A. Einstein’s ideas on Brownian 
motion (see below) using this formalism. A detailed discussion of this approach, with numerical examples of its 
application, may be found, e.g., in the monograph by W. Coffey, Yu. Kalmykov, and J. Waldron, The Langevin 
Equation, World Scientific, 1996. 

22 See, e.g., CM Sec. 4.1. In this and the next sections I assume that variable $f(t)$ is classical, with the discussion of 
the quantum case postponed until Sec. 6. 

$$q_\omega = \frac{f_\omega}{(\kappa - m\omega^2) - i\eta\omega}. \tag{5.67}$$


Now multiplying Eq. (67) by its complex conjugate, averaging both parts of the resulting equation, and 
using for each of them Eq. (57), 23 we get the following relation between spectral densities of the 
oscillations and force: 

$$S_q(\omega) = \frac{1}{(\kappa - m\omega^2)^2 + (\eta\omega)^2}\,S_f(\omega). \tag{5.68}$$


As the reader should know well from classical dynamics, at small damping ($\eta \ll m\omega_0$) the first 
factor in the right-hand part of Eq. (68) describes the resonance, i.e. has a sharp peak near the oscillator’s 
eigenfrequency $\omega_0$, and may be presented in that vicinity as 

$$\frac{1}{(\kappa - m\omega^2)^2 + (\eta\omega)^2} \approx \frac{1}{4m\kappa\,(\xi^2 + \delta^2)}, \qquad \text{at } |\xi| \ll \omega_0, \text{ with } \xi \equiv \omega - \omega_0, \quad \delta \equiv \frac{\eta}{2m}. \tag{5.69}$$


In contrast, the spectral density $S_f(\omega)$ of fluctuations of a typical environment is changing slowly near that 
frequency, so that for the purpose of integration over frequencies near $\omega_0$ we may replace $S_f(\omega)$ with 
$S_f(\omega_0)$. As a result, the variance of the environment-imposed random oscillations may be calculated as 

$$\langle\tilde{q}^2\rangle = 2\int_0^{\infty}S_q(\omega)\,d\omega \approx 2S_f(\omega_0)\,\frac{1}{4m\kappa}\int_{-\infty}^{+\infty}\frac{d\xi}{\xi^2 + \delta^2}. \tag{5.70}$$


The last expression includes a well-known table integral, 24 equal to $\pi/\delta = 2\pi m/\eta$, so that finally 

$$\langle\tilde{q}^2\rangle = 2S_f(\omega_0)\,\frac{1}{4m\kappa}\,\frac{2\pi m}{\eta} = \frac{\pi}{\kappa\eta}\,S_f(\omega_0). \tag{5.71}$$


But on the other hand, the weak interaction with the environment should keep the oscillator in 
thermodynamic equilibrium at the same temperature T. Since our analysis has been based on the 
classical Langevin equation (65), we may only use it in the classical limit $\hbar\omega_0 \ll T$, in which we may 
use the equipartition theorem (2.48). In our current notation, it yields 

$$\frac{\kappa}{2}\langle\tilde{q}^2\rangle = \frac{T}{2}. \tag{5.72}$$


Comparing Eqs. (71) and (72), we see that the spectral density of the random force exerted by the 
environment is fundamentally related to the damping it provides: 

$$S_f(\omega_0) = \frac{\eta}{\pi}\,T. \tag{5.73a}$$

Now we may argue (rather convincingly :-) that since this relation does not depend on the oscillator’s 
parameters m and $\kappa$, and hence its eigenfrequency $\omega_0 = (\kappa/m)^{1/2}$, it should be valid at any (but 


23 At this stage we restrict our analysis to random, stationary processes q(t), so that Eq. (57) is valid for this 
variable as well, if the averaging is understood in the $\langle\langle\ldots\rangle\rangle$ sense. 

24 See, e.g. MA Eq. (6.5a). 

sufficiently low, $\omega\tau_c \ll 1$) frequency. Using Eq. (58) with $\omega \to 0$, it may be rewritten as a formula for 
the effective low-frequency viscosity: 

$$\eta = \frac{1}{T}\int_0^{\infty}K_f(\tau)\,d\tau = \frac{1}{T}\int_0^{\infty}\langle\tilde{f}(0)\tilde{f}(\tau)\rangle\,d\tau. \tag{5.73b}$$
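Both forms of Eq. (73) lend themselves to a simple numerical check. The sketch below, with arbitrary illustrative parameters and hypothetical function names, does two things. First, it integrates the spectral density (68) with the white force spectrum $S_f = \eta T/\pi$ and compares $\kappa\langle\tilde{q}^2\rangle$ with T, for weak as well as strong damping. Second, it recovers $\eta$ from a tabulated force correlation function via Eq. (73b); the exponential shape of $K_f(\tau)$ used there is an assumption made only to have a closed-form test case.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoid rule (local helper, independent of NumPy version)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def coordinate_variance(m, kappa, eta, T, n=1_000_001, span=200.0):
    """<q~^2> = 2 * integral_0^inf S_q(w) dw, with S_q from Eq. (5.68), S_f = eta*T/pi."""
    w0 = np.sqrt(kappa / m)
    w = np.linspace(0.0, span * max(w0, eta / m), n)   # wide enough for both regimes
    S_q = (eta * T / np.pi) / ((kappa - m * w**2) ** 2 + (eta * w) ** 2)
    return 2.0 * trapezoid(S_q, w)

def drag_coefficient(tau, K_f, T):
    """Eq. (5.73b): eta = (1/T) * integral_0^inf K_f(tau) dtau, on a finite grid."""
    return trapezoid(np.asarray(K_f, dtype=float), np.asarray(tau, dtype=float)) / T

T = 1.5  # temperature in energy units (assumed value)

# Check 1: kappa * <q~^2> / T should be close to 1 for any m, kappa, eta.
for m, kappa, eta in [(1.0, 4.0, 0.2), (1.0, 4.0, 5.0), (3.0, 0.5, 0.1)]:
    print(m, kappa, eta, kappa * coordinate_variance(m, kappa, eta, T) / T)

# Check 2: assumed K_f(tau) = (eta*T/tau_c) * exp(-tau/tau_c), whose integral is eta*T.
eta_true, tau_c = 0.8, 1e-2
tau = np.linspace(0.0, 30.0 * tau_c, 3001)
K_f = (eta_true * T / tau_c) * np.exp(-tau / tau_c)
eta_est = drag_coefficient(tau, K_f, T)
print("eta recovered from K_f:", eta_est)
```

The first check goes slightly beyond the small-damping approximation (69): with a frequency-independent $S_f$, the frequency integral of Eq. (68) equals $\pi S_f/\kappa\eta$ for any damping, so the equipartition value is reproduced in the overdamped case as well.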


Relation (73) reveals an intimate, fundamental connection between fluctuations and dissipation 
provided by a thermally-equilibrium environment. Verbally, “there is no dissipation without 
fluctuations” - and vice versa. 25 Historically, this fact was first recognized in 1905 by A. Einstein, 26 in 
the following form. Let us apply our result (73) to the particular case of a free 1D Brownian particle, by 
taking $\kappa = 0$. In this case both equations (71) and (72) give infinities. In order to understand the reason 
for that divergence, let us go back to the Langevin equation (65) with not only $\kappa = 0$, but also, just for 
the sake of simplicity, $m \to 0$ as well. (The latter approximation, frequently called the overdamping 
limit, is quite appropriate for the motion of a small particle in a viscous fluid, when $m \ll \eta\Delta t$ even for 
the smallest time intervals $\Delta t$ between the successive observations of the particle’s positions.) In this 
approximation, Eq. (65) is reduced to a simple equation, 

$$\eta\dot{q} = f_{\rm det}(t) + \tilde{f}(t), \tag{5.74}$$


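with a ready solution for the particle’s displacement during a finite time interval t: 

$$\Delta q(t) \equiv q(t) - q(0) = \langle\langle\Delta q(t)\rangle\rangle + \Delta\tilde{q}(t), \qquad \langle\langle\Delta q(t)\rangle\rangle = \frac{1}{\eta}\int_0^t f_{\rm det}(t')\,dt', \qquad \Delta\tilde{q}(t) = \frac{1}{\eta}\int_0^t\tilde{f}(t')\,dt'. \tag{5.75}$$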

Evidently, in the statistical average of the displacement, the fluctuation effects vanish, but this 
does not mean that the particle does not deviate from the deterministic trajectory $\langle\langle q(t)\rangle\rangle$ - just that it has 
equal probabilities to be shifted in either of the two possible directions from that trajectory. To see that, let us 
calculate the variance of the displacement: 

$$\langle\langle\Delta\tilde{q}^2(t)\rangle\rangle = \frac{1}{\eta^2}\int_0^t dt'\int_0^t dt''\,\langle\tilde{f}(t')\tilde{f}(t'')\rangle = \frac{1}{\eta^2}\int_0^t dt'\int_0^t dt''\,K_f(t' - t''). \tag{5.76}$$


As we already know, at times $\tau \gg \tau_c$ (this correlation time, for typical molecular impacts, is of the order 
of a picosecond), the correlation function may be well approximated by the delta-function - see Eq. (62). In 
this approximation, with $S_f(0)$ expressed by Eq. (73), Eq. (76) yields 

$$\langle\langle\Delta\tilde{q}^2(t)\rangle\rangle \approx \frac{2\pi}{\eta^2}S_f(0)\int_0^t dt'\int_0^t dt''\,\delta(t'' - t') = \frac{2\pi}{\eta^2}\,\frac{\eta T}{\pi}\int_0^t dt' = 2Dt, \tag{5.77}$$

with 


25 This means that the phenomenological description of dissipation by bare viscosity in classical mechanics (see, 
e.g., CM Sec. 4.1) is only valid approximately, when the energy scale of the process is much larger than T. 

26 It was published in one of the three papers of Einstein’s celebrated 1905 “triad”. As a reminder, another paper 
started the (special) relativity, and one more was the quantum description of the photoelectric effect, essentially the 
prediction of light quanta - photons, which started quantum mechanics. (Not too bad for one year!) 

$$D = \frac{T}{\eta}. \tag{5.78}$$

The final form of Eq. (77) describes the well-known law of diffusion (“random walk”) of a 1D 
system, with the r.m.s. deviation from the point of origin growing as $(2Dt)^{1/2}$. Coefficient D in this 
relation is called the coefficient of diffusion, and Eq. (78) describes the extremely simple Einstein 
relation between that coefficient and the particle’s damping. Often this relation is rewritten in SI units of 
temperature as $D = \mu_m k_B T_K$, where $\mu_m \equiv 1/\eta$ is the mobility of the particle. The physical sense of $\mu_m$ 
becomes clear from rewriting the expression for the deterministic viscous motion $\langle\langle q(t)\rangle\rangle$ (particle’s 
“drift”) in the form: 

$$v_{\rm drift} \equiv \frac{d\langle\langle q(t)\rangle\rangle}{dt} = \frac{1}{\eta}\,f_{\rm det}(t) \equiv \mu_m f_{\rm det}(t), \tag{5.79}$$


so that mobility is just velocity given to the particle by unit force. 27 
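The diffusion law may be illustrated by a direct simulation of the overdamped Langevin equation (74) with $f_{\rm det} = 0$. The sketch below is a minimal illustration under stated assumptions: time is discretized with a step dt much longer than $\tau_c$, so that the force is treated as delta-correlated, and each step shifts the particle by an independent Gaussian kick of variance $2(T/\eta)dt$, which is what Eqs. (62), (73) and (76) give for one time step. All parameter values and names are arbitrary; the simulation therefore only shows how the per-step variances add up to the linear growth $2Dt$.

```python
import numpy as np

def mean_square_displacement(T, eta, dt, n_steps, n_particles, seed=0):
    """Ensemble-averaged <<dq^2(t)>> for eta * dq/dt = f~(t) with white-noise force."""
    rng = np.random.default_rng(seed)
    D = T / eta                                          # Einstein relation, Eq. (5.78)
    kicks = rng.normal(scale=np.sqrt(2.0 * D * dt), size=(n_steps, n_particles))
    displacement = np.cumsum(kicks, axis=0)              # dq after 1, 2, ..., n_steps
    return np.mean(displacement**2, axis=1)              # average over the ensemble

# Assumed illustrative parameters (T in energy units).
T, eta, dt, n_steps = 1.0, 4.0, 1e-3, 200
msd = mean_square_displacement(T, eta, dt, n_steps, n_particles=20_000)
t = dt * np.arange(1, n_steps + 1)

# Least-squares slope of msd = 2 D t through the origin.
D_fit = float(np.sum(t * msd) / np.sum(t * t)) / 2.0
print("D from simulation:", D_fit, " T/eta:", T / eta)
```

With 20,000 particles the statistical scatter of the fitted D is at the level of about one percent; it decreases as the inverse square root of the ensemble size.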

Another famous example of application of Eq. (73) is to the thermal (or “Johnson”, or “Johnson-Nyquist”, 
or just “Nyquist”) noise in resistive electron devices. Let us consider a two-terminal “probe” 
circuit, playing the role of the harmonic oscillator in our analysis above, connected to a resistor R (Fig. 
9), playing the role of the noisy environment. (The noise is generated by the thermal motion of numerous 
electrons, randomly moving inside the resistor.) For this system, one convenient choice of conjugate 
variables (the generalized coordinate and generalized force) is, respectively, the electric charge 
$Q \equiv \int I(t)\,dt$ that has passed through the “probe” circuit by time t, and voltage V across its terminals, with the 
polarity shown in Fig. 9. (Indeed, the product $V\,dQ$ is the elementary work $d\mathcal{W}$ done by the 
environment on the probe circuit.) 


Fig. 5.9. Resistor R of temperature T as a noisy 
environment of a two-terminal probe circuit. 


Making the corresponding replacements, $q \to Q$ and $f \to V$ in Eq. (64), we see that it becomes 

$$\langle V\rangle = -\eta\dot{Q} \equiv -\eta I. \tag{5.80}$$

Comparing this relation with Ohm’s law, $R(-I) = V$, 28 we see that in this case, coefficient $\eta$ has the 
physical sense of the usual Ohmic resistance R, 29 so that Eq. (73) becomes 


27 In solid-state physics and electronics, mobility is more frequently defined as $\mu \equiv |v_{\rm drift}/\mathcal{E}| = e|v_{\rm drift}/f_{\rm det}|$ (where $\mathcal{E}$ is 
the applied electric field), and is traditionally measured in cm$^2$/V·s. In these units, the electron mobility in silicon 
wafers used for integrated circuit fabrication (i.e. the solid most important for engineering practice) at room 
temperature is close to $10^3$. 

28 The minus sign is due to the fact that in our notation, current through the resistor equals $(-I)$ - see Fig. 9. 

29 Due to this fact, Eq. (64) is often called the Ohmic model of the environment response, even if the physical 
nature of variables q and f is completely different from the electric charge and voltage. 

$$S_V(\omega) = \frac{R}{\pi}\,T. \tag{5.81a}$$

Using Eq. (61), and transferring to the SI units of temperature ($T \to k_B T_K$), we can bring this famous 
Nyquist formula 30 to its most popular form 

$$\langle\tilde{V}^2\rangle_{\Delta\nu} = 4k_B T_K R\,\Delta\nu. \tag{5.81b}$$

Note that according to Eq. (65), this result is only valid at a negligible speed of change of the 
generalized coordinate q (in this case, negligible current I), i.e. Eq. (81) expresses the voltage 
fluctuations as would be measured by an ideal voltmeter, with an input resistance much higher than R. 
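To get a feeling for the scale of this noise, the short sketch below evaluates the r.m.s. voltage given by Eq. (81b). The resistance, temperature and bandwidth are illustrative assumptions; the Boltzmann constant is the exact SI value.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K (exact SI value)

def thermal_voltage_rms(resistance_ohm, temperature_k, bandwidth_hz):
    """R.m.s. open-circuit thermal noise voltage from Eq. (5.81b)."""
    return math.sqrt(4.0 * K_B * temperature_k * resistance_ohm * bandwidth_hz)

# Assumed example: a 1 kOhm resistor at 300 K, observed within a 1 Hz bandwidth.
v_rms = thermal_voltage_rms(1.0e3, 300.0, 1.0)
print(f"{v_rms:.3e} V")   # about 4 nV
```

Since the variance is proportional to $R\,\Delta\nu$, the r.m.s. voltage scales as the square root of both the resistance and the bandwidth.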

On the other hand, applying a different choice of generalized coordinate and force, $q \to \Phi$, $f \to I$ 
(where $\Phi \equiv \int V(t)\,dt$ is the generalized magnetic flux, so that $d\mathcal{W} = I\,d\Phi$), we get $\eta \to 1/R$, and Eq. (73) 
yields the thermal fluctuations of the current through the resistor (as measured by an ideal ammeter, i.e. 
at $V \to 0$): 

$$S_I(\omega) = \frac{1}{\pi R}\,T, \qquad \text{i.e. } \langle\tilde{I}^2\rangle_{\Delta\nu} = \frac{4k_B T_K}{R}\,\Delta\nu. \tag{5.81c}$$


Note that Eqs. (81) are valid for noise in thermal equilibrium only. In electric circuits, which may 
be readily driven out of equilibrium by an applied voltage $\langle V\rangle$, other types of noise are frequently 
important, notably the shot noise, which arises in short conductors, e.g., tunnel junctions, at applied 
voltages $|\langle V\rangle| \gg T/q$, due to the discreteness of charge carriers. 31 A straightforward analysis using a 
simple model, described in the assignment of Exercise Problem 9, shows that this noise may be 
characterized by current fluctuations with low-frequency spectral density 

$$S_I(\omega) = \frac{|q\langle I\rangle|}{2\pi}, \qquad \text{i.e. } \langle\tilde{I}^2\rangle_{\Delta\nu} = 2|q\langle I\rangle|\,\Delta\nu, \tag{5.82}$$

where q is the electric charge of a single current carrier. This is the Schottky formula, valid for any 
relation between $\langle I\rangle$ and $\langle V\rangle$. Comparison of Eqs. (81c) and (82) for a device that obeys the Ohm law shows 
that the shot noise has the same intensity as the thermal noise with effective temperature 

$$T_{\rm ef} = \frac{|q\langle V\rangle|}{2} \gg T. \tag{5.83}$$


This relation may be interpreted as a result of charge carrier overheating by the applied electric field, 
and explains why the Schottky formula (82) is only valid in conductors much shorter than the energy 


30 Named after H. Nyquist who derived this formula in 1928 (independently of the prior work by A. Einstein, M. 
Smoluchowski, and P. Langevin) to describe the noise which had been just discovered experimentally by his Bell 
Labs’ colleague J. B. Johnson. The derivation of Eq. (73) and hence Eq. (81) in these notes is essentially a twist of 
the derivation used by Nyquist. 

31 Another practically important type of fluctuations in electronic devices is the low-frequency 1/f noise which 
was already mentioned in Sec. 3 above. I will briefly discuss it in Sec. 8. 

relaxation length $l_e$ of the charge carriers. 32 Another mechanism of the shot noise suppression, that 
becomes noticeable if the system’s transparency is high, is the Fermi-Dirac statistics of electrons. 33 
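The comparison of the thermal and shot noise may be made quantitative with a few lines of code. The sketch below uses the standard SI forms of the Nyquist formula for current and of the Schottky formula, $\langle\tilde{I}^2\rangle_{\Delta\nu} = 4k_BT_K\Delta\nu/R$ and $2|q\langle I\rangle|\Delta\nu$, and the effective temperature $|q\langle V\rangle|/2k_B$ in kelvins. The function names and the example values (an Ohmic device with R = 1 kOhm, single-electron carriers) are illustrative assumptions.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K (exact SI value)
Q_E = 1.602176634e-19    # elementary charge, C (exact SI value)

def thermal_current_variance(resistance_ohm, temperature_k, bandwidth_hz):
    """Nyquist formula for current: <I~^2> = 4 k_B T_K dnu / R."""
    return 4.0 * K_B * temperature_k * bandwidth_hz / resistance_ohm

def shot_current_variance(mean_current_a, bandwidth_hz, carrier_charge_c=Q_E):
    """Schottky formula: <I~^2> = 2 |q <I>| dnu."""
    return 2.0 * abs(carrier_charge_c * mean_current_a) * bandwidth_hz

def effective_temperature_k(voltage_v, carrier_charge_c=Q_E):
    """Temperature (in kelvins) at which thermal noise equals shot noise: |q V| / (2 k_B)."""
    return abs(carrier_charge_c * voltage_v) / (2.0 * K_B)

# Assumed example: Ohmic device, R = 1 kOhm, biased at 1 V, 1 Hz bandwidth.
R, V, dnu = 1.0e3, 1.0, 1.0
T_ef = effective_temperature_k(V)
shot = shot_current_variance(V / R, dnu)
thermal_at_T_ef = thermal_current_variance(R, T_ef, dnu)

# Voltage at which the two noise intensities are equal at 300 K: 2 k_B T_K / q.
crossover_voltage = 2.0 * K_B * 300.0 / Q_E
print(T_ef, shot, thermal_at_T_ef, crossover_voltage)
```

For a 1 V bias the effective temperature is close to 5,800 K, and at room temperature the two intensities become equal at a bias of about 52 mV, which illustrates the condition on the applied voltage under which shot noise dominates.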

Returning to the bolometric Dicke radiometer (see Figs. 6-7 and their discussion), we may now 
use the Langevin equation formalism to finalize its analysis. For this system, the Langevin equation is 
just the usual equation of heat balance: 

$$C_V\frac{dT}{dt} + \mathcal{G}(T - T_0) = \mathcal{P}_{\rm det}(t) + \tilde{\mathcal{P}}(t), \tag{5.84}$$

where $\mathcal{P}_{\rm det} \equiv \langle\mathcal{P}\rangle$ describes the (deterministic) power of absorbed radiation, and $\tilde{\mathcal{P}}$ presents the effective 
source of temperature fluctuations. Now we can use Eq. (84) to carry out a calculation of the spectral 
density $S_T(\omega)$ of temperature fluctuations absolutely similar to how this was done with Eq. (65), 
assuming that the frequency spectrum of the fluctuation source is much broader than the intrinsic 
bandwidth $1/\tau = \mathcal{G}/C_V$ of the bolometer, so that its spectral density at frequencies $\omega\tau \sim 1$ may be well 
approximated by its low-frequency value $S_{\mathcal{P}}(0)$: 

$$S_T(\omega) = \left|\frac{1}{-i\omega C_V + \mathcal{G}}\right|^2 S_{\mathcal{P}}(0). \tag{5.85}$$

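Then, requiring the variance of temperature fluctuations, 

$$\langle\tilde{T}^2\rangle = (\delta T)^2 = 2\int_0^{\infty}S_T(\omega)\,d\omega = 2S_{\mathcal{P}}(0)\int_0^{\infty}\left|\frac{1}{-i\omega C_V + \mathcal{G}}\right|^2 d\omega = 2S_{\mathcal{P}}(0)\,\frac{1}{C_V^2}\int_0^{\infty}\frac{d\omega}{\omega^2 + (\mathcal{G}/C_V)^2} = \frac{\pi S_{\mathcal{P}}(0)}{\mathcal{G}C_V}, \tag{5.86}$$

to coincide with our earlier “thermodynamic fluctuation” result (41), we get 

$$S_{\mathcal{P}}(0) = \frac{\mathcal{G}}{\pi}\,T_0^2. \tag{5.87}$$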

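The r.m.s. value of the “power noise” $\tilde{\mathcal{P}}$ within bandwidth $\Delta\nu \ll 1/\tau$ (Fig. 7) becomes equal to the 
deterministic signal power $\mathcal{P}_{\rm det}$ (or more exactly, the main harmonic of its modulation law) at 

$$\mathcal{P}_{\min} = \langle\tilde{\mathcal{P}}^2\rangle_{\Delta\nu}^{1/2} = \left(2S_{\mathcal{P}}(0)\Delta\omega\right)^{1/2} = 2(\mathcal{G}\Delta\nu)^{1/2}\,T_0. \tag{5.88}$$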

This result shows that our earlier prediction (45) may be improved by a substantial factor of the 
order of $(\Delta\nu/\nu)^{1/2}$, where the reduction of the output bandwidth is limited only by the signal 
accumulation time $\Delta t \sim 1/\Delta\nu$, while the increase of $\nu$ is limited by the speed of (typically, mechanical) 
devices performing the power modulation. In practical systems this factor may improve the sensitivity 
by a couple of orders of magnitude, enabling observation of extremely weak radiation. Maybe the most 
spectacular example is the recent measurements of the CMB radiation (discussed in Sec. 2.6), which 
corresponds to blackbody temperature $T_K \approx 2.725$ K, with accuracy $\delta T_K \sim 10^{-6}$ K, using microwave 

32 See, e.g., Y. Naveh et al., Phys. Rev. B 58, 15371 (1998). In practically used metals, l e is of the order of 30 nm 
even at liquid helium temperatures (and even shorter at ambient conditions), so that the usual “macroscopic” 
resistors do not exhibit the shot noise. 

33 For a review of this effect see, e.g., Ya. Blanter and M. Büttiker, Phys. Repts. 336, 1 (2000).
receivers with physical temperature of all their components much higher than $\delta T_K$. The observed weak (~$10^{-5}$ K) anisotropy of the CMB radiation is a major experimental basis of all modern cosmology.
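As a quick numerical illustration of Eqs. (85)-(88), here is a minimal Python sketch. The parameter values and the function names are arbitrary assumptions made only for this check (temperature is in energy units, as everywhere in this section). It integrates the spectral density (85) numerically and compares the result with the value $T_0^2/C_V$ implied by Eqs. (86) and (87).

```python
import math

def spectral_density_T(omega, S_P0, G, C_V):
    """Eq. (5.85): S_T(omega) = S_P(0) / |-i*omega*C_V + G|^2."""
    return S_P0 / ((omega * C_V) ** 2 + G ** 2)

def temperature_variance(S_P0, G, C_V, n=20000):
    """Left part of Eq. (5.86): 2 * integral of S_T over 0 < omega < infinity.

    The infinite range is mapped onto 0 < theta < pi/2 by the substitution
    omega = (G/C_V) * tan(theta), and integrated by the midpoint rule.
    """
    w0 = G / C_V                      # intrinsic bandwidth 1/tau
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        omega = w0 * math.tan(theta)
        jacobian = w0 / math.cos(theta) ** 2   # d(omega)/d(theta)
        total += spectral_density_T(omega, S_P0, G, C_V) * jacobian * h
    return 2.0 * total

def p_min(G, T0, delta_nu):
    """Eq. (5.88): P_min = 2 * (G * delta_nu)^(1/2) * T0."""
    return 2.0 * math.sqrt(G * delta_nu) * T0

# Arbitrary illustrative values (not taken from any real bolometer)
G, C_V, T0, delta_nu = 2.0, 5.0, 3.0, 0.01
S_P0 = G * T0 ** 2 / math.pi          # Eq. (5.87)
var_T = temperature_variance(S_P0, G, C_V)
print(var_T, T0 ** 2 / C_V)           # the two numbers should agree
print(p_min(G, T0, delta_nu))
```

The substitution $\omega = (\mathcal{G}/C_V)\tan\theta$ is used here only to avoid truncating the infinite integration range; any other quadrature with a sufficiently large cutoff should give the same result.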

Let me also note that Eq. (73) may be readily generalized to the case when environment’s 
response is different from the Ohmic model (64). This generalization is virtually evident from Eq. (66). 
Indeed, the second term in its left-hand part is just the Fourier component of the average response of the 
environment: 

$$\langle f_\omega\rangle = i\omega\eta\, q_\omega.$$  (5.89)

Let the environment’s response be still linear, but have an arbitrary dispersion, 

$$\langle f_\omega\rangle = \chi(\omega)\, q_\omega,$$  (5.90)


where the function $\chi(\omega)$, called the generalized susceptibility of the environment, may be complex, i.e. have both the imaginary and real parts:

$$\chi(\omega) = \chi'(\omega) + i\chi''(\omega).$$  (5.91)


Then Eq. (73) remains valid 34 with the replacement $\eta \to \chi''(\omega)/\omega$:

$$S_f(\omega) = \frac{\chi''(\omega)}{\pi\omega}T.$$  (5.92)


This fundamental relation 35 is used not only to calculate the fluctuation intensity from the known generalized susceptibility (i.e. the deterministic response of a complex system to a small perturbation), but sometimes in the opposite direction - to calculate the linear response from the known fluctuations. (The latter use is especially attractive in numerical simulations, such as molecular dynamics approaches, because it avoids the need to filter a weak response from the noisy background.)




Now let us discuss what generalization of Eq. (92) is necessary to make that fundamental result suitable for arbitrary temperatures, $T \sim \hbar\omega$. The calculations we have performed started from the apparently classical equation of motion, Eq. (63). However, quantum mechanics shows 36 that a similar equation is valid for the corresponding Heisenberg-picture operators, so that repeating all arguments leading to the Langevin equation (65), we may write its quantum-mechanical version

$$m\hat{\ddot{q}} + \eta\hat{\dot{q}} + \kappa\hat{q} = \hat{f}_{\rm det} + \hat{\tilde{f}}.$$  (5.93)


34 Reviewing the calculations leading to Eq. (73), we may see that the possible real part $\chi'(\omega)$ of the susceptibility just adds up to $(\kappa - m\omega^2)$ in the denominator of Eq. (67), resulting in a change of the oscillator's eigenfrequency. This renormalization is insignificant if the oscillator-to-environment coupling is weak, i.e. if the susceptibility $\chi(\omega)$ is small, as had been assumed at the derivation of Eq. (69) and hence Eq. (73).

35 It is sometimes called the Green-Kubo (or just “Kubo”) formula. This is hardly fair, because, as the reader 
could see, Eq. (92) is just an elementary generalization of the Nyquist formula (81). Moreover, the corresponding 
works of M. Green and R. Kubo were published, respectively, in 1954 and 1957, i.e. after the 1951 paper by H.
Callen and T. Welton, where a more general result (see below) had been derived. More adequately, the Green / 
Kubo names are associated with a related relation between the response function and the operator commutator - 
see, e.g., QM Eq. (7.109). 

36 See, e.g., QM Sec. 4.6. 
This is the so-called Heisenberg-Langevin (or “quantum Langevin”) equation - in this particular case, for a harmonic oscillator.

The further operations, however, require certain caution, because the right-hand part of the 
equation is now an operator, and has some nontrivial properties. For example, the “values” of the 
Heisenberg operator, representing the same variable $f(t)$ at different times, do not necessarily commute:

$$\left[\hat{\tilde{f}}(t), \hat{\tilde{f}}(t')\right] \neq 0.$$  (5.94)


As a result, the function defined by Eq. (46) may not be an even function of time delay $\tau \equiv t' - t$ even for a stationary process, making it inadequate for representation of the real correlation function - which has to obey Eq. (51). This technical difficulty may be circumvented by the introduction of the following symmetrized correlation function

$$K_f(\tau) \equiv \frac{1}{2}\left\langle\hat{\tilde{f}}(t)\hat{\tilde{f}}(t+\tau) + \hat{\tilde{f}}(t+\tau)\hat{\tilde{f}}(t)\right\rangle \equiv \frac{1}{2}\left\langle\left\{\hat{\tilde{f}}(t), \hat{\tilde{f}}(t+\tau)\right\}\right\rangle,$$  (5.95)


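(where $\{\ldots,\ldots\}$ denotes the anticommutator of the two operators), and, similarly, the symmetrical spectral density $S_f(\omega)$, defined by relation

$$S_f(\omega)\,\delta(\omega - \omega') \equiv \frac{1}{2}\left\langle\hat{f}_\omega\hat{f}_{-\omega'} + \hat{f}_{-\omega'}\hat{f}_\omega\right\rangle \equiv \frac{1}{2}\left\langle\left\{\hat{f}_\omega, \hat{f}_{-\omega'}\right\}\right\rangle,$$  (5.96)

with $K_f(\tau)$ and $S_f(\omega)$ still related by the Fourier transform (59). 37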

Now we may repeat all the analysis that was carried out for the classical case, and get Eq. (71) 
again, but this expression has to be compared not with the equipartition theorem (72), but with its 
quantum-mechanical generalization (2.78), which, in our current notation, reads 



$$\langle\tilde{q}^2\rangle = \frac{\hbar\omega_0}{2\kappa}\coth\frac{\hbar\omega_0}{2T}.$$  (5.97)


As a result, we get the following quantum-mechanical generalization of Eq. (92): 


$$S_f(\omega) = \frac{\hbar\chi''(\omega)}{2\pi}\coth\frac{\hbar\omega}{2T}.$$  (5.98)


This is the much-celebrated fluctuation-dissipation theorem, frequently referred to just as FDT. 38 
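The two limits of Eq. (98) are easy to check numerically. The following minimal Python sketch uses function names of my own choosing, sets $\hbar = 1$ by default, and takes the Ohmic model $\chi''(\omega) = \eta\omega$ as a test case; all of these are illustrative assumptions. It verifies that the FDT reduces to the classical result (92) at $T \gg \hbar\omega$, and becomes temperature-independent at $T \ll \hbar\omega$.

```python
import math

def fdt_quantum(chi_im, omega, T, hbar=1.0):
    """Eq. (5.98): S_f = (hbar * chi'' / 2 pi) * coth(hbar * omega / 2T)."""
    x = hbar * omega / (2.0 * T)
    return hbar * chi_im / (2.0 * math.pi) / math.tanh(x)

def fdt_classical(chi_im, omega, T):
    """Eq. (5.92): S_f = chi'' * T / (pi * omega)."""
    return chi_im * T / (math.pi * omega)

def low_temperature_limit(chi_im, hbar=1.0):
    """The T << hbar*omega limit of Eq. (5.98), where coth -> 1."""
    return hbar * chi_im / (2.0 * math.pi)

# Ohmic test case: chi''(omega) = eta * omega (illustrative values)
eta, omega = 0.3, 1.0
chi_im = eta * omega

hot = fdt_quantum(chi_im, omega, T=100.0) / fdt_classical(chi_im, omega, T=100.0)
cold = fdt_quantum(chi_im, omega, T=0.01) / low_temperature_limit(chi_im)
print(hot, cold)   # both ratios should be very close to 1
```

At $\hbar\omega/2T = 0.005$ the first ratio differs from unity only by a correction of the order of $(\hbar\omega/2T)^2/3$, as follows from the expansion $\coth x \approx 1/x + x/3$.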


As natural as it seems, this generalization poses a very interesting conceptual dilemma. Let, for the sake of clarity, temperature be relatively low, $T \ll \hbar\omega$; then Eq. (98) gives a temperature-independent result

$$S_f(\omega) = \frac{\hbar\chi''(\omega)}{2\pi},$$  (5.99)


37 Please note that here (and to the end of this section) brackets $\langle\ldots\rangle$ mean quantum-statistical averaging (2.12). As was discussed in Sec. 2.1, for a classical-mixture state of the environment, this does not create any difference in either mathematical treatment of the averages or their physical interpretation.

38 It was first derived in 1951 by H. Callen and T. Welton (in a somewhat different way). One more derivation of 
the FDT, which gives the Kubo formula as a by-product, may be found in QM Sec. 7.4. 
which is frequently called the quantum noise. According to the quantum Langevin equation (93), nothing but these fluctuations of the force exerted by the environment, with spectral density proportional to the imaginary part of susceptibility (i.e. damping), are the source of the ground-state “fluctuations” of the coordinate and momentum of a quantum harmonic oscillator, with r.m.s. values

$$\delta q \equiv \langle\tilde{q}^2\rangle^{1/2} = \left(\frac{\hbar}{2m\omega_0}\right)^{1/2}, \quad \delta p \equiv \langle\tilde{p}^2\rangle^{1/2} = m\omega_0\,\delta q = \left(\frac{\hbar m\omega_0}{2}\right)^{1/2}, \quad \text{i.e. } \delta q\cdot\delta p = \frac{\hbar}{2},$$  (5.100)


and average energy $\hbar\omega_0/2$. On the other hand, the basic quantum mechanics tells us that exactly these formulas describe the ground state of a dissipation-free oscillator, not coupled to any environment, and are a direct corollary of the Heisenberg uncertainty relation

$$\delta q\cdot\delta p \geq \frac{\hbar}{2}.$$  (5.101)


(The Gaussian wave packets, pertinent to a harmonic oscillator's ground state, turn the sign in Eq. (101) into pure equality.) So, what is the genuine source of Eqs. (100)?

The resolution of this paradox is that either interpretation of Eqs. (100) is legitimate, with their 
relative convenience depending on the particular application. (One can say that since the right-hand part 
of the quantum Langevin equation (93) is a quantum-mechanical operator, rather than a classical force, 
it “carries the uncertainty relation within itself”.) However, this opportunistic resolution leaves the
following question open: is the quantum noise (99) of the environment observable directly, without any 
probe oscillator subjected to it? An experimental resolution of this dilemma is not quite simple, because 
usual scientific instruments have their own zero-point fluctuations, which may be readily confused with 
those of the system under study. Fortunately, this difficulty may be overcome, for example, using unique 
frequency-mixing (“down-conversion”) properties of Josephson junctions. 39 Special low-temperature 
experiments using such down-conversion 40 have confirmed that noise (99) is real and measurable. This 
has been one of the most convincing direct demonstrations of the reality of the zero-point energy $\hbar\omega/2$. 41

Finally, let me mention briefly an alternative derivation 42 of the fluctuation-dissipation theorem from the general quantum mechanics of open systems. This derivation is substantially longer, but gives an interesting by-product,

$$\left\langle\left[\hat{\tilde{f}}(t), \hat{\tilde{f}}(t+\tau)\right]\right\rangle = i\hbar\, G(\tau),$$  (5.102)

where $G(\tau)$ is the temporal Green's function of the environment (as “seen” by the system subjected to the generalized force $f$), defined by equation

$$\langle f(t)\rangle = \int_0^\infty G(\tau)\, q(t-\tau)\,d\tau = \int_{-\infty}^{t} G(t-t')\, q(t')\,dt'.$$  (5.103)


39 K. Likharev and V. Semenov, JETP Lett. 15, 442 (1972).

40 R. Koch et al., Phys. Rev. B 26 , 74 (1982). 

41 Another one is the Casimir effect - see, e.g., QM Sec. 9.1. 

42 See, e.g., QM Sec. 7.4. 
Plugging the Fourier transforms of all three functions participating in Eq. (103) into that relation, it is straightforward to check 43 that the Green's function is just the Fourier image of the complex susceptibility defined by Eq. (90):

$$\int_0^\infty G(\tau)e^{i\omega\tau}\,d\tau = \chi(\omega);$$  (5.104)

here 0 is used as the lower limit instead of ($-\infty$) just to emphasize that due to the causality principle, the Green's function has to be equal to zero for $\tau < 0$.

In order to reveal the real beauty of Eq. (102), we may use the Wiener-Khinchin theorem (59) to 
rewrite the fluctuation-dissipation theorem (98) in a form similar to Eq. (102): 

$$\left\langle\left\{\hat{\tilde{f}}(t), \hat{\tilde{f}}(t+\tau)\right\}\right\rangle = 2K_f(\tau),$$  (5.105)

where the correlation function $K_f(\tau)$ is most simply described by its Fourier transform, equal to $\pi S_f(\omega)$:

$$\int_0^\infty K_f(\tau)\cos\omega\tau\,d\tau = \frac{\hbar\chi''(\omega)}{2}\coth\frac{\hbar\omega}{2T}.$$  (5.106)

The comparison of Eqs. (102) and (104), on one hand, and Eqs. (105)-(106), on the other hand, shows that both the commutation and anticommutation properties of the Heisenberg-Langevin force operator at different moments of time are determined by the same generalized susceptibility $\chi(\omega)$, but the average anticommutator also depends on temperature, while the average commutator does not. 44


5.6. The Kramers problem and the Smoluchowski equation 

Returning to the classical case, it is evident that the Langevin equation (65) provides the means 
not only for the analysis of stationary fluctuations, but also for the description of an arbitrary time 
evolution of (classical) dynamic systems coupled to their environment - which, again, provides both 
dissipation and fluctuations. However, this approach suffers from two major handicaps. 

First, this equation does enable us to find the statistical average of variable q, and the variance of 
its fluctuations (i.e., in the common mathematical terminology, the first and second moments of the 
probability distribution) as functions of time, but not the distribution w(q, t ) as such. This may not look 
like a big problem, because in most cases (in particular, in linear systems such as the harmonic 
oscillator) the distribution is Gaussian - see, e.g., Eq. (2.77). 

The second, more painful, drawback of the Langevin approach is that it is instrumental only for the already mentioned “linear” systems - i.e., the systems whose dynamics is described by linear differential equations, such as Eq. (65). However, as we know from classical dynamics, many important problems (for example, the Kepler problem of planetary motion 45 ) are reduced to 1D motion in substantially anharmonic potentials $U_{\rm ef}(q)$, leading to nonlinear equations of motion. If the energy of interaction between the system and its random environment is bilinear - i.e. is a product of variables


43 See, e.g., CM Sec. 4.1, part (ii). 

44 Only explicitly so, because the complex susceptibility of the environment may depend on temperature as well. 

45 See, e.g., CM Sec. 3.4-3.6. 
belonging to these sub-systems (as is very frequently the case), we may repeat all arguments of the last section to derive the following generalized version of the Langevin equation

$$m\ddot{q} + \eta\dot{q} + \frac{\partial U(q,t)}{\partial q} = \tilde{f}(t),$$  (5.107)

valid for an arbitrary, possibly time-dependent potential $U(q, t)$. 46 Unfortunately, the solution of this equation may be very hard. Indeed, the Fourier analysis carried out in the last section was essentially based on the linear superposition principle that is invalid for nonlinear equations.

If the fluctuation intensity is low, $|\delta q| \ll \langle q\rangle$, where $\langle q\rangle(t)$ is the deterministic solution of Eq. (107) in the absence of fluctuations, this equation may be linearized 47 with respect to small fluctuations $\tilde{q} \equiv q - \langle q\rangle$ to get a linear equation,

$$m\ddot{\tilde{q}} + \eta\dot{\tilde{q}} + \kappa(t)\tilde{q} = \tilde{f}(t), \quad \text{with } \kappa(t) \equiv \frac{\partial^2}{\partial q^2}U(\langle q\rangle(t), t).$$  (5.108)

This equation differs from Eq. (65) only by the time dependence of the effective spring constant $\kappa(t)$, and may be solved by the Fourier expansion of both fluctuations and function $\kappa(t)$. Such calculations are somewhat more cumbersome than those performed above, but may be doable (especially if the unperturbed motion $\langle q\rangle(t)$ is periodic), and sometimes give useful analytical results. 48

However, some important problems cannot be solved by the linearization. Perhaps the most apparent example is the so-called Kramers problem 49 of finding the lifetime of a metastable state of a 1D classical system in a potential well separated from the continuum motion region by a potential barrier (Fig. 10).



Fig. 5.10. The Kramers problem. 


In the absence of fluctuations, the system, placed close to the well's bottom ($q \approx q_1$), would stay there forever. Fluctuations result not only in a finite spread of the probability density $w(q, t)$ around that point, but also in the gradual decrease of the total probability

$$W(t) = \int_{\text{well's bottom}} w(q,t)\,dq$$  (5.109)


46 The generalization of Eq. (107) to higher spatial dimensionality is also straightforward, with the scalar variable $q$ replaced by vector $\mathbf{q}$, and the scalar derivative $\partial U/\partial q$ replaced with vector $\nabla U$.

47 See, e.g., CM Secs. 3.2, 4.2, and beyond. 

48 See, e.g., Chapters 5 and 6 in W. Coffey et al., The Langevin Equation, World Scientific, 1996. 

49 After H. Kramers who, besides solving this important problem in 1940, has made significant contributions to 
many other areas of physics, including the famous Kramers-Kronig dispersion relations - see, e.g., EM Sec. 7.4. 
to find the system in the well, because of the growing probability of escape from the well, over the potential barrier, due to thermal activation. If the barrier height,

$$U_0 \equiv U(q_2) - U(q_1),$$  (5.110)

is much larger than temperature $T$, 50 the Boltzmann distribution $w \propto \exp\{-U(q)/T\}$ should be approximately valid in most of the well, so that the probability for the system to overcome the barrier should scale as $\exp\{-U_0/T\}$. From these handwaving arguments, one may reasonably expect that if the probability $W(t)$ that the system is still in the well by time $t$ obeys the usual “decay law”

$$\dot{W} = -\frac{W}{\tau},$$  (5.111)

then lifetime $\tau$ has to obey the general Arrhenius law, $\tau = \tau_A\exp\{U_0/T\}$. However, that relation needs to be proved, and the pre-exponential coefficient $\tau_A$ (frequently called the attempt time) needs to be calculated. This cannot be done by the linearization of Eq. (107), because the linearization is equivalent to a quadratic approximation of the potential $U(q)$, which evidently cannot describe the potential well and the potential barrier simultaneously - see Fig. 10.

This and other essentially nonlinear problems may be addressed using an alternative approach to fluctuation analysis, dealing directly with the time evolution of the probability density $w(q,t)$. Due to the shortage of time, I will review this approach a bit superficially, using mostly handwaving arguments, and refer the interested reader to special literature 51 for strict mathematical proofs. Let us start from the effect of diffusion of a free 1D particle in the high damping limit, described by the Langevin equation (74), and assume that at all times the probability distribution stays Gaussian:

$$w(q,t) = \frac{1}{(2\pi)^{1/2}\,\delta q(t)}\exp\left\{-\frac{(q - q_0)^2}{2\,\delta q^2(t)}\right\},$$  (5.112)

where $q_0$ is the initial position of the particle, and $\delta q(t)$ is the time-dependent distribution width, which grows in time in accordance with Eq. (77):

$$\delta q(t) = (2Dt)^{1/2}.$$  (5.113)


It is straightforward to check, by substitution, that this solution satisfies the following simple partial differential equation, 52

$$\frac{\partial w}{\partial t} = D\frac{\partial^2 w}{\partial q^2},$$  (5.114)

with the delta-functional initial condition

$$w(q,0) = \delta(q - q_0).$$  (5.115)
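The substitution check mentioned above may also be carried out numerically. Here is a minimal Python sketch (the function names, the finite-difference step, and the parameter values are arbitrary assumptions made for this illustration) that evaluates both sides of the 1D diffusion equation on the Gaussian distribution (112)-(113) by central finite differences.

```python
import math

def w_gauss(q, t, D, q0=0.0):
    """Eqs. (5.112)-(5.113): Gaussian of width dq(t) = (2 D t)^(1/2)."""
    dq = math.sqrt(2.0 * D * t)
    return math.exp(-(q - q0) ** 2 / (2.0 * dq ** 2)) / (math.sqrt(2.0 * math.pi) * dq)

def diffusion_residual(q, t, D, h=1e-3):
    """dw/dt - D * d2w/dq2, with both derivatives from central differences."""
    dw_dt = (w_gauss(q, t + h, D) - w_gauss(q, t - h, D)) / (2.0 * h)
    d2w_dq2 = (w_gauss(q + h, t, D) - 2.0 * w_gauss(q, t, D)
               + w_gauss(q - h, t, D)) / h ** 2
    return dw_dt - D * d2w_dq2

D, t = 0.7, 1.0                      # illustrative values
residuals = [diffusion_residual(q, t, D) for q in (-1.0, 0.5, 2.0)]
print(residuals)                     # should all be close to zero
```

The residuals are not exactly zero only because of the finite-difference truncation and rounding errors.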


50 If $U_0$ is comparable with $T$, the system's behavior also depends substantially on the initial probability distribution, i.e., does not follow the universal law (111).

51 See, e.g., either R. Stratonovich, Topics in the Theory of Random Noise, vol. 1., Gordon and Breach, 1963, or Chapter 1 in the monograph by W. Coffey et al., cited above.

52 By the way, the goal of the traditional coefficient 2 in Eq. (77) is exactly to have the fundamental Eq. (114) free 
of numerical coefficients. 
The simple and important equation of diffusion (114) may be naturally generalized to the 3D motion: 53

$$\frac{\partial w}{\partial t} = D\nabla^2 w.$$  (5.116)

Now let us compare this equation with the probability conservation law, 54

$$\frac{\partial w}{\partial t} + \nabla\cdot\mathbf{j}_w = 0,$$  (5.117a)


where vector $\mathbf{j}_w$ has the physical sense of the probability current density. (The validity of this relation is evident from its integral form,

$$\frac{d}{dt}\int_V w\,d^3r + \oint_S\mathbf{j}_w\cdot d^2\mathbf{r} = 0,$$  (5.117b)


that results from integration of Eq. (117a) over an arbitrary time-independent volume V limited by 
surface S, and applying the divergence theorem 55 to the second term.) The continuity relation (117a) 
coincides with Eq. (116), with D given by Eq. (78), only if we take 

$$\mathbf{j}_w = -D\nabla w = -\frac{T}{\eta}\nabla w.$$  (5.118)


The first form of this relation allows a simple interpretation: the probability flow is proportional to the
spatial gradient of probability density (i.e., in application to many (N) similar and independent particles, 
just to the gradient of their concentration n = Nw), with the sign corresponding to the flow from the 
higher to lower concentration. This flow is the very essence of the effect of diffusion. 

The fundamental Eq. (117) has to be satisfied also for a force-driven particle at negligible diffusion ($D \to 0$); in this case

$$\mathbf{j}_w = w\mathbf{v},$$  (5.119)


where v is the deterministic velocity of the particle. In the high-damping limit we are considering right 
now, v is just the drift velocity: 

$$\mathbf{v} = \frac{1}{\eta}\mathbf{f}_{\rm det} = -\frac{1}{\eta}\nabla U(\mathbf{r}),$$  (5.120)

where $\mathbf{f}_{\rm det}$ is the deterministic force described by potential energy $U(\mathbf{r})$. Now, as we have descriptions of $\mathbf{j}_w$ due to both drift and diffusion separately, we may rationally assume that in the general case when both effects are present, the corresponding components of the probability current just add up, so that

$$\mathbf{j}_w = \frac{1}{\eta}\left[w(-\nabla U) - T\nabla w\right],$$  (5.121)


53 As will be discussed in Chapter 6, the equation of diffusion also describes several other physical phenomena - 
in particular, heat propagation in a uniform, isotropic solid, and in this context is called the heat conduction 
equation or (rather inappropriately) just the “heat equation”. 

54 Both forms of Eq. (117) are similar to the mass conservation law in classical dynamics (see, e.g., CM Sec. 8.2), 
and the electric charge conservation law in electrodynamics (see, e.g., EM Sec. 4.1). 

55 See, e.g., MA Eq. (12.2).
and Eq. (117a) takes the form

$$\eta\frac{\partial w}{\partial t} = \nabla\cdot(w\nabla U) + T\nabla^2 w.$$  (5.122)


This is the Smoluchowski equation , 56 which is closely related to the Boltzmann equation in multi- 
particle kinetics - to be discussed in the next chapter. 


As a sanity check, let us see what the Smoluchowski equation gives in the stationary limit, $\partial w/\partial t \to 0$ (which evidently may be achieved only if the deterministic potential $U$ is time-independent). Then Eq. (117a) yields $\mathbf{j}_w$ = const, where the constant describes the motion of the system as the whole. If such motion is absent, $\mathbf{j}_w = 0$, then according to Eq. (121),

$$w\nabla U + T\nabla w = 0, \quad \text{i.e. } \frac{\nabla w}{w} = -\frac{\nabla U}{T}.$$  (5.123)

Since the left-hand part of the last form of the last relation is just $\nabla(\ln w)$, Eq. (123) may be immediately integrated, giving

$$\ln w = -\frac{U}{T} + \ln C, \quad \text{i.e. } w(\mathbf{r}) = C\exp\left\{-\frac{U(\mathbf{r})}{T}\right\}.$$  (5.124)

Multiplied by the number $N$ of similar, independent systems, with spatial density $n(\mathbf{r}) = Nw(\mathbf{r})$, this is just the Boltzmann distribution (3.26).
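This sanity check may be illustrated numerically in one dimension. In the minimal Python sketch below, the double-well test potential, the parameter values, and the function names are arbitrary assumptions of this illustration. The sum of the drift and diffusion currents is evaluated by finite differences: it vanishes for the Boltzmann distribution, but not for a similar distribution with a “wrong” temperature.

```python
import math

def probability_current(w, U, q, eta, T, h=1e-5):
    """1D sum of drift and diffusion currents: (1/eta) * [w * (-dU/dq) - T * dw/dq],
    with the derivatives taken by central finite differences."""
    dU_dq = (U(q + h) - U(q - h)) / (2.0 * h)
    dw_dq = (w(q + h) - w(q - h)) / (2.0 * h)
    return (-w(q) * dU_dq - T * dw_dq) / eta

def U(q):
    """Illustrative double-well potential (an assumption of this example)."""
    return q ** 4 / 4.0 - q ** 2 / 2.0

eta, T = 1.0, 0.3

def w_boltzmann(q):
    return math.exp(-U(q) / T)              # normalization constant is irrelevant here

def w_wrong(q):
    return math.exp(-U(q) / (2.0 * T))      # distribution with a doubled temperature

currents = [probability_current(w_boltzmann, U, q, eta, T) for q in (-1.2, 0.5, 0.9)]
current_wrong = probability_current(w_wrong, U, 0.5, eta, T)
print(currents, current_wrong)
```

A nonvanishing current for the second distribution means that it is not stationary: according to the continuity equation, it has to evolve in time.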


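Now, as a less trivial example of the Smoluchowski equation's applications, let us use it to solve the 1D Kramers problem (Fig. 10) in the corresponding high-damping limit, $m \ll \eta\tau_A$. It is straightforward to check that the 1D version of Eq. (121),

$$I_w = \frac{1}{\eta}\left[w\left(-\frac{\partial U}{\partial q}\right) - T\frac{\partial w}{\partial q}\right],$$  (5.125a)

is equivalent to

$$I_w = -\frac{T}{\eta}\exp\left\{-\frac{U(q)}{T}\right\}\frac{\partial}{\partial q}\left(w\exp\left\{\frac{U(q)}{T}\right\}\right),$$  (5.125b)

(where $I_w$ is the probability current at a certain location $q$, rather than its density), so that we can write

$$I_w\exp\left\{\frac{U(q)}{T}\right\} = -\frac{T}{\eta}\frac{\partial}{\partial q}\left(w\exp\left\{\frac{U(q)}{T}\right\}\right).$$  (5.126)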


As was discussed above, the notion of metastable state’s lifetime is well defined only for sufficiently 
low temperatures 

$$T \ll U_0,$$  (5.127)


56 Named after M. Smoluchowski who developed this formalism in 1906, apparently independently from the 
slightly earlier Einstein’s work, and in much more detail. This equation has important applications in many fields 
of science, including such surprising topics as statistics of spikes in neural networks. 


when the lifetime is relatively long, $\tau \gg \tau_A$, where $\tau_A$ has to be of the order of the time of the system relaxation inside the well. Since the first term of the continuity equation (117b) is of the order of $W/\tau$, in this limit the term, and hence the gradient of $I_w$, are negligibly small, so the probability current does not depend on $q$ in the potential barrier region. Let us integrate both sides of Eq. (126) over that region, using that fact:

$$I_w\int_{q'}^{q''}\exp\left\{\frac{U(q)}{T}\right\}dq = -\frac{T}{\eta}\left(w\exp\left\{\frac{U(q)}{T}\right\}\right)\Bigg|_{q'}^{q''},$$  (5.128)


where the integration limits $q'$ and $q''$ (Fig. 10) are selected so that

$$T \ll U(q') - U(q_1),\ U(q_2) - U(q'') \ll U_0.$$  (5.129)


(Evidently, such selection is only possible if condition (127) is satisfied.) In this limit the contribution to 
the right-hand part from point q” is negligible because the probability density behind the barrier is 
exponentially small. On the other hand, the probability at point q ’ is close to its stationary, Boltzmann 
value (124), so that 


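$$w(q')\exp\left\{\frac{U(q')}{T}\right\} = w(q_1)\exp\left\{\frac{U(q_1)}{T}\right\},$$  (5.130)

and Eq. (128) yields

$$I_w = \frac{T}{\eta}\,w(q_1)\Bigg/\int_{q'}^{q''}\exp\left\{\frac{U(q) - U(q_1)}{T}\right\}dq.$$  (5.131)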


We are almost done. The probability density $w(q_1)$ at the well's bottom may be expressed in terms of the total probability $W$ of the particle being in the well by using the normalization condition

$$W = \int_{\text{well's bottom}} w(q_1)\exp\left\{\frac{U(q_1) - U(q)}{T}\right\}dq;$$  (5.132)

the integration here may be limited by the region where the difference $U(q) - U(q_1)$ is larger than $T$ but still much smaller than $U_0$ - cf. Eq. (129). According to the Taylor expansion, the shape of any smooth potential well near its bottom may be well approximated by a quadratic parabola:

$$U(q \approx q_1) - U(q_1) \approx \frac{\kappa_1}{2}(q - q_1)^2, \quad \text{where } \kappa_1 \equiv \frac{d^2U}{dq^2}\bigg|_{q=q_1} > 0.$$  (5.133)


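With this approximation, Eq. (132) is reduced to the standard Gaussian integral: 57

$$W = w(q_1)\int_{\text{well's bottom}}\exp\left\{-\frac{\kappa_1(q - q_1)^2}{2T}\right\}dq \approx w(q_1)\int_{-\infty}^{+\infty}\exp\left\{-\frac{\kappa_1\tilde{q}^2}{2T}\right\}d\tilde{q} = w(q_1)\left(\frac{2\pi T}{\kappa_1}\right)^{1/2}.$$  (5.134)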


To complete the calculation, we may use the similar approximation, 


57 If necessary, see MA Eq. (6.9b) again. 
$$U(q \approx q_2) - U(q_1) \approx \left[U(q_2) - \frac{\kappa_2}{2}(q - q_2)^2\right] - U(q_1) = U_0 - \frac{\kappa_2}{2}(q - q_2)^2, \quad \text{where } \kappa_2 \equiv -\frac{d^2U}{dq^2}\bigg|_{q=q_2} > 0,$$  (5.135)

to work out the remaining integral in Eq. (131), because in the limit (129) this integral is dominated by the contribution from a region very close to the barrier top, where approximation (135) is asymptotically exact. As a result, we get

$$\int_{q'}^{q''}\exp\left\{\frac{U(q) - U(q_1)}{T}\right\}dq \approx \exp\left\{\frac{U_0}{T}\right\}\left(\frac{2\pi T}{\kappa_2}\right)^{1/2}.$$  (5.136)


Plugging Eq. (136), and $w(q_1)$ expressed from Eq. (134), into Eq. (131), we finally get

$$I_w = W\frac{(\kappa_1\kappa_2)^{1/2}}{2\pi\eta}\exp\left\{-\frac{U_0}{T}\right\}.$$  (5.137)


This expression should be compared with the 1D version of Eq. (117b) for the segment [$-\infty$, $q'$]. Since this interval covers the region near $q_1$ where most of the probability density resides, and $I_w(-\infty) = 0$, the result is merely

$$\frac{dW}{dt} + I_w(q') = 0.$$  (5.138)

In our approximation, $I_w(q')$ does not depend on the exact position of point $q'$, and is given by Eq. (137), so that plugging it into Eq. (138), we recover the exponential decay law (111), with lifetime

$$\tau = \frac{2\pi\eta}{(\kappa_1\kappa_2)^{1/2}}\exp\left\{\frac{U_0}{T}\right\} = 2\pi(\tau_1\tau_2)^{1/2}\exp\left\{\frac{U_0}{T}\right\}, \quad \text{where } \tau_{1,2} \equiv \frac{\eta}{\kappa_{1,2}}.$$  (5.139)


Thus the metastable state lifetime is indeed described by the Arrhenius law, with the attempt time scaling as the geometric mean of the system's “relaxation times” near the potential well bottom ($\tau_1$) and the potential barrier top ($\tau_2$). 58 I leave it as an exercise for the reader to prove that if the potential profile near the well's bottom and/or the barrier's top is sharp, the pre-exponential factor in Eq. (139) should be modified, but the Arrhenius exponent is not affected.
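The accuracy of the Gaussian approximations used in this derivation may be estimated numerically. The minimal Python sketch below evaluates the lifetime $\tau = W/I_w$ directly from the exact integrals in Eqs. (131) and (132), and compares it with the closed-form result (139). The cubic test potential $U = aq - bq^3/3$ with $a = b = 1$, the integration limits, the temperature, and all function names are assumptions made only for this illustration.

```python
import math

def integrate(f, a, b, n=20000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def lifetime_numeric(U, q1, q_left, q_prime, q_dprime, eta, T):
    """tau = W / I_w, with W from Eq. (5.132) and I_w from Eq. (5.131)."""
    U1 = U(q1)
    well = integrate(lambda q: math.exp(-(U(q) - U1) / T), q_left, q_prime)      # W / w(q1)
    barrier = integrate(lambda q: math.exp((U(q) - U1) / T), q_prime, q_dprime)
    return eta * well * barrier / T

def lifetime_formula(kappa1, kappa2, U0, eta, T):
    """Eq. (5.139): tau = 2 pi eta / (kappa1 kappa2)^(1/2) * exp(U0 / T)."""
    return 2.0 * math.pi * eta / math.sqrt(kappa1 * kappa2) * math.exp(U0 / T)

def U(q):
    """Illustrative cubic potential with a = b = 1: minimum at q = -1, maximum at q = +1."""
    return q - q ** 3 / 3.0

eta, T = 1.0, 0.1
U0 = U(1.0) - U(-1.0)             # barrier height, equal to 4/3
kappa = 2.0                       # |d2U/dq2| at both q = -1 and q = +1
tau_num = lifetime_numeric(U, q1=-1.0, q_left=-4.0, q_prime=0.0, q_dprime=1.7,
                           eta=eta, T=T)
tau_formula = lifetime_formula(kappa, kappa, U0, eta, T)
print(tau_num / tau_formula)      # close to 1 as long as T << U0
```

For these parameters, $U_0/T \approx 13$, and the two results differ by a few percent, due to the anharmonicity of the potential near its extrema; the difference should decrease as the temperature is reduced.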


5.7. The Fokker-Planck equation

Expression (139) is just a particular, high-damping limit of a more general result obtained by Kramers. In order to recover all of it, we need to generalize the Smoluchowski equation to arbitrary values of damping $\eta$. In this case, the probability density $w$ is a function of not only the particle's position $\mathbf{q}$ (and time $t$), but also its momentum $\mathbf{p}$ - see Eq. (2.11). Thus the continuity equation (117a) needs to be generalized to 6D phase space. Such generalization is natural:


58 Actually, $\tau_2$ describes the characteristic time of the exponential growth of small deviations from the unstable fixed point $q_2$ at the barrier top, rather than their decay, as near point $q_1$.
$$\frac{\partial w}{\partial t} + \nabla_q\cdot\mathbf{j}_q + \nabla_p\cdot\mathbf{j}_p = 0,$$  (5.140)

where $\mathbf{j}_q$ (which was called $\mathbf{j}_w$ in the last section) is the probability current density in the coordinate space, while $\mathbf{j}_p$ is the current density in the momentum space, and $\nabla_p$ is the gradient operator in that space,

$$\nabla_p \equiv \sum_{j=1}^{3}\mathbf{n}_j\frac{\partial}{\partial p_j},$$  (5.141)

while $\nabla_q$ is the usual gradient operator in the coordinate space, which was denoted as $\nabla$ in the previous section - with index $q$ added here just for additional clarity. At negligible fluctuations ($T \to 0$), $\mathbf{j}_p$ in the momentum space may be evaluated using the natural analogy with $\mathbf{j}_q$ - see Eq. (119). In our new notation, that relation takes the following form,

$$\mathbf{j}_q = w\mathbf{v} = w\dot{\mathbf{q}} = w\frac{\mathbf{p}}{m},$$  (5.142)

so it is natural to take

$$\mathbf{j}_p = w\dot{\mathbf{p}} = w\langle\mathbf{f}\rangle = w(-\nabla_q U - \eta\mathbf{v}) = w\left(-\nabla_q U - \eta\frac{\mathbf{p}}{m}\right).$$  (5.143)


As a sanity check, it is straightforward to verify that the diffusion-free equation resulting from 
the combination of Eqs. (140), (142) and (143), 


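$$\frac{\partial w}{\partial t}\bigg|_{\rm drift} = -\nabla_q\cdot\left(w\frac{\mathbf{p}}{m}\right) + \nabla_p\cdot\left[w\left(\nabla_q U + \eta\frac{\mathbf{p}}{m}\right)\right],$$  (5.144)

allows the following particular solution

$$w(\mathbf{q},\mathbf{p},t) = \delta\left(\mathbf{q} - \langle\mathbf{q}\rangle(t)\right)\delta\left(\mathbf{p} - \langle\mathbf{p}\rangle(t)\right),$$  (5.145)

where the statistical-average coordinate and momentum satisfy the deterministic equations of motion,

$$\langle\dot{\mathbf{q}}\rangle = \frac{\langle\mathbf{p}\rangle}{m}, \quad \langle\dot{\mathbf{p}}\rangle = -\nabla_q U - \eta\frac{\langle\mathbf{p}\rangle}{m},$$  (5.146)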


describing particle’s drift, with the appropriate deterministic initial conditions. 

In order to understand how the diffusion may be accounted for, let us consider a statistical ensemble of free ($\nabla_q U = 0$, $\eta \to 0$) particles that are uniformly distributed in direct space (so that $\nabla_q w = 0$), but possibly localized in the momentum space. For this case, the right-hand part of Eq. (144) vanishes, i.e. the time evolution of the probability density $w$ may be only due to diffusion. In the corresponding limit $\langle\mathbf{f}\rangle \to 0$, the Langevin equation (107) for each Cartesian coordinate is reduced to

$$m\ddot{q}_j = \tilde{f}_j(t), \quad \text{i.e. } \dot{p}_j = \tilde{f}_j(t).$$  (5.147)

This equation is similar to the high-damping 1D equation (74) (with $f_{\rm det} = 0$), with replacement $q \to p_j/\eta$, and hence the corresponding contribution to $\partial w/\partial t$ may be described by the second term of Eq. (122) with that replacement:

$$\frac{\partial w}{\partial t}\bigg|_{\rm diffusion} = D\nabla^2_{p/\eta}\,w = \frac{T}{\eta}\eta^2\nabla_p^2 w = \eta T\nabla_p^2 w.$$  (5.148)


Now the reasonable assumption that in the arbitrary case the drift and diffusion contributions to dw/dt 
just add up, immediately leads us to the full Fokker-Planck equation : 59 


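$$\frac{\partial w}{\partial t} = -\nabla_q\cdot\left(w\frac{\mathbf{p}}{m}\right) + \nabla_p\cdot\left[w\left(\nabla_q U + \eta\frac{\mathbf{p}}{m}\right)\right] + \eta T\nabla_p^2 w.$$  (5.149)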


As a sanity check, let us use this equation to find the stationary probability distribution of momentum of free particles, at arbitrary damping $\eta$, in the momentum space, assuming their uniform distribution in the direct space, $\nabla_q w = 0$. In the stationary case $\partial w/\partial t = 0$, so that Eq. (149) is reduced to

$$\nabla_p\cdot\left[w\left(\eta\frac{\mathbf{p}}{m}\right)\right] + \eta T\nabla_p^2 w = 0.$$  (5.150)

The damping coefficient $\eta$ cancels, and the first integration over momentum yields

$$\frac{\mathbf{p}}{m}w + T\nabla_p w = \mathbf{j},$$  (5.152)

where $\mathbf{j}$ is a vector constant describing a possible motion of the system as the whole. In the absence of such motion, $\mathbf{j} = 0$, the second integration over momentum gives

$$w = \text{const}\times\exp\left\{-\frac{p^2}{2mT}\right\},$$  (5.153)


i.e. the Maxwell distribution (3.5). However, result (153) is more general than that obtained in Sec. 3.1, 
because it shows that the distribution stays the same even at nonvanishing damping. 

It is also easy to show that if the damping is large (in the sense assumed in the last section), the solution of the Fokker-Planck equation tends to the following product

$$w(\mathbf{q},\mathbf{p},t) \to \text{const}\times\exp\left\{-\frac{p^2}{2mT}\right\}\times w(\mathbf{q},t),$$  (5.154)

where the direct-space distribution $w(\mathbf{q},t)$ obeys the Smoluchowski equation (122). However, in the general case, solutions of Eq. (149) may be rather complex, 60 so I will mention (rather than derive) only one of them, that of the Kramers problem (Fig. 10). Acting virtually exactly as in Sec. 6, one can show that at arbitrary damping (but still in the limit (127), $T \ll U_0$, with the additional restriction $\tau \gg m/\eta$), the metastable state's lifetime is again given by the Arrhenius formula (139), with the same exponent $\exp\{U_0/T\}$, but with the reciprocal time constants $1/\tau_{1,2}$ replaced with

$$\omega'_{1,2} \equiv \left[\omega_{1,2}^2 + \left(\frac{\eta}{2m}\right)^2\right]^{1/2} - \frac{\eta}{2m} \to \begin{cases}\omega_{1,2}, & \text{for } \eta \ll m\omega_{1,2},\\ 1/\tau_{1,2}, & \text{for } m\omega_{1,2} \ll \eta,\end{cases}$$  (5.155)


59 It was derived in 1913 in A. Fokker’s PhD thesis work; M. Planck was his thesis adviser. 

60 The reader should remember that these solutions embody, as the particular case T= 0, all classical dynamics of 
a particle. 
where $\omega_{1,2} \equiv (\kappa_{1,2}/m)^{1/2}$, while $\kappa_{1,2}$ are the effective spring constants defined by Eqs. (133) and (135). Thus, in the most important particular limit of low damping, Eq. (139) is replaced with the famous Kramers formula

$$\tau = \frac{2\pi}{(\omega_1\omega_2)^{1/2}}\exp\left\{\frac{U_0}{T}\right\}.$$  (5.156)

This Kramers' result for the classical thermal activation of the virtually-Hamiltonian system over the potential barrier may be compared with that for its quantum-mechanical tunneling through the barrier. 61 Even the simplest, WKB approximation for the latter time,

$$\tau_Q = \tau_A\exp\left\{2\int_{\kappa^2(q)>0}\kappa(q)\,dq\right\}, \quad \text{with } \frac{\hbar^2\kappa^2(q)}{2m} \equiv U(q) - E,$$  (5.157)


shows that generally those two lifetimes have different dependences on the barrier shape. For example,
for a nearly-rectangular potential barrier, the exponent that determines the classical lifetime (156)
depends (linearly) only on the barrier height $U_0$, while that defining the quantum lifetime is proportional
to the barrier width and scales as the square root of $U_0$. However, in the important case of “soft”
potential profiles, which are typical for the case of barely emerging (or nearly disappearing) quantum
wells (Fig. 11), the classical and quantum results may be simply related.



Fig. 5.11. Cubic-parabolic potential 
profile and its parameters. 


Indeed, such a potential profile $U(q)$ may be well approximated by the 4 leading terms of its Taylor
expansion, with the highest term proportional to $(q - q_0)^3$, near some point $q_0$ in the vicinity of the well.
In this approximation, the second derivative $d^2U/dq^2$ vanishes at the point $q_0 = (q_1 + q_2)/2$, exactly
between the well’s bottom and the barrier’s top (in Fig. 11, $q_1$ and $q_2$). Selecting the origin at this point,
we may reduce the approximation to just two terms: 62

$$U(q) = aq - \frac{b}{3}q^3, \qquad (5.158)$$

with $ab > 0$. Using straightforward calculus, we can find all important parameters of this cubic
parabola: the positions of its minimum and maximum:

$$q_2 = -q_1 = (a/b)^{1/2}, \qquad (5.159)$$

the barrier height over the well’s bottom:


61 See, e.g., QM Secs. 2. 3-2.4. 

62 As a reminder, an absolutely similar approximation is used in Exercise Problem 4.3 for the P{V) function, in 
order to analyze properties of the van der Waals model near the critical temperature. 
$$U_0 \equiv U(q_2) - U(q_1) = \frac{4}{3}\left(\frac{a^3}{b}\right)^{1/2}, \qquad (5.160)$$

and the effective spring constants:

$$\kappa_1 = \kappa_2 \equiv \left|\frac{d^2U}{dq^2}\right|_{q_{1,2}} = 2(ab)^{1/2}. \qquad (5.161)$$


The last expression shows that for this potential profile, the frequencies $\omega_{1,2}$ participating in Eq.
(156) are equal to each other, so that this result may be rewritten as

$$\tau = \frac{2\pi}{\omega_0}\exp\left\{\frac{U_0}{T}\right\}, \quad \text{with } \omega_0^2 \equiv \frac{2(ab)^{1/2}}{m}. \qquad (5.162)$$


On the other hand, for the same profile, the WKB approximation (157) (which is accurate when the
height of the metastable state energy over the well’s bottom, $E - U(q_1) \approx \hbar\omega_0/2$, is much less than the
barrier height $U_0$) yields 63

$$\tau_Q = \frac{2\pi}{\omega_0}\left(\frac{\hbar\omega_0}{864\pi U_0}\right)^{1/2}\exp\left\{\frac{36}{5}\frac{U_0}{\hbar\omega_0}\right\}. \qquad (5.163)$$


Comparison of the dominating, exponential factors in these two results shows that the thermal
activation yields a shorter lifetime (i.e., dominates the metastable state decay) if the temperature is above the
crossover value at which the two exponents, $U_0/T$ and $36U_0/5\hbar\omega_0$, are equal:

$$T_c = \frac{5}{36}\hbar\omega_0 \approx 0.139\,\hbar\omega_0. \qquad (5.164)$$

This expression for the cubic-parabolic barrier may be compared with the similar crossover for a
quadratic-parabolic barrier, 64 for which $T_c = \hbar\omega_0/2\pi \approx 0.159\,\hbar\omega_0$. We see that the numerical factors for
these two different soft potential profiles differ substantially from unity, but are rather close to each other.
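As a quick numerical cross-check of Eqs. (158)-(163), the following minimal sketch evaluates the WKB exponent of Eq. (157) for the cubic-parabolic profile by direct quadrature. It assumes, as in footnote 63, that the small offset of $E$ above $U(q_1)$ may be ignored in the exponent; the function name, the default parameter values, and the midpoint-rule integration are illustrative choices, not part of the original derivation.

```python
import math

def wkb_exponent_ratio(a=1.0, b=1.0, m=1.0, hbar=1.0, n=200_000):
    """Return [2 * integral of kappa(q) dq] * hbar*omega0 / U0 for U(q) = a*q - (b/3)*q**3."""
    def U(q):
        return a * q - (b / 3.0) * q**3

    q2 = math.sqrt(a / b)             # barrier top, Eq. (5.159)
    q1 = -q2                          # well bottom, Eq. (5.159)
    U0 = U(q2) - U(q1)                # barrier height, Eq. (5.160)
    kappa = 2.0 * math.sqrt(a * b)    # effective spring constant, Eq. (5.161)
    omega0 = math.sqrt(kappa / m)     # frequency defined in Eq. (5.162)

    # Assumption: E = U(q1), so that U(q) - E > 0 for q1 < q < 2*q2
    # (the classically forbidden region under the barrier).
    E = U(q1)
    lo, hi = q1, 2.0 * q2
    h = (hi - lo) / n
    integral = 0.0
    for i in range(n):                # simple midpoint rule
        q = lo + (i + 0.5) * h
        integral += math.sqrt(max(2.0 * m * (U(q) - E), 0.0)) / hbar * h
    return 2.0 * integral * hbar * omega0 / U0

if __name__ == "__main__":
    # The ratio should be close to 36/5 for any positive a, b, m, hbar.
    print(wkb_exponent_ratio())
    print(wkb_exponent_ratio(a=2.0, b=0.5, m=3.0, hbar=0.7))
```

The ratio is independent of the particular values of $a$, $b$, $m$ and $\hbar$, in agreement with the numerical factor in the exponent of Eq. (163).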


5.8. Back to the correlation function 

Unfortunately I will not have time to review solutions of other problems using the 
Smoluchowski and Fokker-Planck equations, but have to mention one conceptual issue. Since it is 
intuitively clear that these equations provide the complete statistical information about the system under 
analysis, one may wonder whether they may be used to find the temporal characteristics of the system, 
which were discussed in Secs. 4-5 using the Langevin formalism. For any statistical average of a 
function taken at the same time instant, the answer is evidently yes - cf. Eq. (2.11):

$$\langle f(\mathbf{q}(t),\mathbf{p}(t))\rangle = \int f(\mathbf{q},\mathbf{p})\,w(\mathbf{q},\mathbf{p},t)\,d^3q\,d^3p, \qquad (5.165)$$


63 The main, exponential factor in this result may be obtained simply by ignoring the difference between $E$ and
$U(q_1)$, but the correct calculation of the pre-exponent requires taking this difference, $\hbar\omega_0/2$, into account - see K.
Likharev, Physica B 108, 1079 (1981).

64 See, e.g., QM Sec. 2.4. 
but what if the function depends on variables taken at different times, for example the components of the
correlation function $K_f(\tau)$ defined by Eq. (49)?

To answer this question, let us start from the discrete variable case when Eq. (165) takes form 
(2.7), which, for our current purposes, may be rewritten as 

$$\langle f(t)\rangle = \sum_m f_m W_m(t). \qquad (5.166)$$


In plain English, this is a sum of all possible values of the function, each multiplied by its probability as
a function of time. But this means that the average $\langle f(t)f(t')\rangle$ may be calculated as the sum of all possible
products $f_m f_{m'}$, multiplied by the joint probability of measurement outcome $m$ at moment $t$, and outcome
$m'$ at moment $t'$. The joint probability may be presented as a product of $W_m(t)$ by the conditional
probability $W(m', t' \mid m, t)$. Since the correlation function is well defined only for stationary systems, in
the last expression we can take $t = 0$, i.e. find the conditional probability as the result, $W_{m'}(\tau)$, of the solution
of the equation describing the system’s probability evolution, at time $\tau = t' - t$ (rather than $t'$), with the
special initial condition

$$W_{m'}(0) = \delta_{m',m}. \qquad (5.167)$$

On the other hand, since the average $\langle f(t)f(t+\tau)\rangle$ of a stationary process should not depend on $t$, instead
of $W_m(t)$ we may take the stationary probability distribution $W_m(\infty)$, which is independent of the initial
conditions and may be found as the same special solution, but at time $\tau \to \infty$. As a result, we may write

$$\langle f(t)f(t+\tau)\rangle = \sum_{m,m'} f_m W_m(\infty)\, f_{m'} W_{m'}(\tau). \qquad (5.168)$$


This expression looks simple, but note that this recipe requires solving the time evolution
equations for each $W_{m'}(\tau)$ for all possible initial conditions (167). To see how this recipe works in
practice, let us revisit the simplest two-level system (see, e.g., Fig. 4.13 reproduced in Fig. 12 below in a
notation more convenient for our current purposes), and calculate the correlation function of its energy
fluctuations.


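[Figure: a two-level system with energies $E_1 = \Delta$ and $E_0 = 0$, and level occupation probabilities $W_1(t)$ and $W_0(t)$.]

Fig. 5.12. Dynamics of a two-level system.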


The stationary probabilities for this system (i.e. the probabilities for $\tau \to \infty$) have been calculated
in Chapter 2, and then again in Sec. 4.4. In our current notation (Fig. 12),

$$W_0(\infty) = \frac{1}{1+e^{-\Delta/T}}, \quad W_1(\infty) = \frac{1}{e^{\Delta/T}+1}, \quad \langle E\rangle = W_0(\infty)\times 0 + W_1(\infty)\times\Delta = \frac{\Delta}{e^{\Delta/T}+1}. \qquad (5.169)$$


In order to calculate the conditional probabilities $W_{m'}(\tau)$ with initial conditions (167) (according to Eq.
(168), we need all 4 of them, for $m, m' = 0, 1$), we may use master equations (4.100), in our current
notation reading

$$\frac{dW_1}{d\tau} = \Gamma_\uparrow W_0 - \Gamma_\downarrow W_1, \quad \frac{dW_0}{d\tau} = \Gamma_\downarrow W_1 - \Gamma_\uparrow W_0. \qquad (5.170)$$


Since Eq. (170) conserves the total probability, $W_0 + W_1 = 1$, only one probability (say, $W_1$) is an
independent variable, and for it, Eq. (170) gives a simple, linear differential equation,

$$\frac{dW_1}{d\tau} = \Gamma_\uparrow - \Gamma_\Sigma W_1, \quad \text{where } \Gamma_\Sigma \equiv \Gamma_\uparrow + \Gamma_\downarrow. \qquad (5.171)$$

This equation may be readily integrated for an arbitrary initial condition:

$$W_1(\tau) = W_1(0)e^{-\Gamma_\Sigma\tau} + W_1(\infty)\left(1 - e^{-\Gamma_\Sigma\tau}\right), \qquad (5.172)$$

where $W_1(\infty)$ is given by the second of Eqs. (169). (It is straightforward to check that the solution for
$W_0(\tau)$ may be presented in a similar form, with the corresponding change of the state index.) Now
everything is ready to calculate the average $\langle E(t)E(t+\tau)\rangle$ using Eq. (168), with $f_{m,m'} = E_{0,1}$. Thanks to our
(smart :-) choice of energy origin, of the 4 terms in the double sum (168), all 3 terms that include at least
one factor $E_0 = 0$ vanish, and we have only one term left:

$$\begin{aligned}
\langle E(t)E(t+\tau)\rangle &= E_1 W_1(\infty)\, E_1 W_1(\tau)\big|_{W_1(0)=1} = E_1^2 W_1(\infty)\left[W_1(0)e^{-\Gamma_\Sigma\tau} + W_1(\infty)\left(1-e^{-\Gamma_\Sigma\tau}\right)\right]_{W_1(0)=1} \\
&= \frac{\Delta^2}{e^{\Delta/T}+1}\left[e^{-\Gamma_\Sigma\tau} + \frac{1-e^{-\Gamma_\Sigma\tau}}{e^{\Delta/T}+1}\right] = \frac{\Delta^2}{\left(e^{\Delta/T}+1\right)^2}\left(1 + e^{\Delta/T}e^{-\Gamma_\Sigma\tau}\right). \qquad (5.173)
\end{aligned}$$


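From here and the last of Eqs. (169), the correlation function of energy fluctuations is 65

$$\begin{aligned}
K_E(\tau) &\equiv \langle\tilde E(t)\tilde E(t+\tau)\rangle = \left\langle\left(E(t)-\langle E(t)\rangle\right)\left(E(t+\tau)-\langle E(t+\tau)\rangle\right)\right\rangle = \langle E(t)E(t+\tau)\rangle - \langle E(t)\rangle\langle E(t+\tau)\rangle \\
&= \langle E(t)E(t+\tau)\rangle - \langle E\rangle^2 = \Delta^2\frac{e^{\Delta/T}}{\left(e^{\Delta/T}+1\right)^2}e^{-\Gamma_\Sigma\tau}. \qquad (5.174)
\end{aligned}$$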




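Since the transition rates $\Gamma_\uparrow$ and $\Gamma_\downarrow$ have to obey the detailed balance relation (4.103), $\Gamma_\downarrow/\Gamma_\uparrow = \exp\{\Delta/T\}$,
and hence

$$\frac{e^{\Delta/T}}{\left(e^{\Delta/T}+1\right)^2} = \frac{\Gamma_\downarrow/\Gamma_\uparrow}{\left(\Gamma_\downarrow/\Gamma_\uparrow+1\right)^2} = \frac{\Gamma_\uparrow\Gamma_\downarrow}{\left(\Gamma_\uparrow+\Gamma_\downarrow\right)^2} = \frac{\Gamma_\uparrow\Gamma_\downarrow}{\Gamma_\Sigma^2}, \qquad (5.175)$$

expression (174) may be presented also in a simpler form:

$$K_E(\tau) = \Delta^2\frac{\Gamma_\uparrow\Gamma_\downarrow}{\Gamma_\Sigma^2}e^{-\Gamma_\Sigma\tau}. \qquad (5.176)$$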


We see that the correlation function of energy decays exponentially with time, with the net rate
$\Gamma_\Sigma$, while its variance, equal to $K_E(0)$, does not depend on the transition rates. Now using the
Wiener-Khinchin theorem (58) to calculate its spectral density, we get


65 The transition from the first line of Eq. (174) to its second one uses the fact that the system is stationary, so that
$\langle E(t+\tau)\rangle = \langle E(t)\rangle = \langle E\rangle = \text{const}$.


$$S_E(\omega) = \frac{1}{\pi}\int_0^\infty \Delta^2\frac{\Gamma_\uparrow\Gamma_\downarrow}{\Gamma_\Sigma^2}e^{-\Gamma_\Sigma\tau}\cos\omega\tau\,d\tau = \frac{\Delta^2}{\pi}\frac{\Gamma_\uparrow\Gamma_\downarrow}{\Gamma_\Sigma\left(\Gamma_\Sigma^2+\omega^2\right)}. \qquad (5.177)$$
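The recipe (168) for the two-level system can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names and parameter values are hypothetical, the master equation (170) is integrated with a generic fourth-order Runge-Kutta step rather than via the analytical solution (172), and the stationary probabilities are taken from Eq. (169).

```python
import math

def conditional_probs(m_start, gamma_up, gamma_down, tau, steps=1000):
    """Integrate Eq. (5.170) from the initial condition (5.167); return (W0, W1) at time tau."""
    w1 = float(m_start)               # W1(0) = 1 if the system starts in state 1, else 0
    dt = tau / steps

    def rhs(w):
        # dW1/dtau = gamma_up * W0 - gamma_down * W1, with W0 = 1 - W1
        return gamma_up * (1.0 - w) - gamma_down * w

    for _ in range(steps):            # classical RK4 step
        k1 = rhs(w1)
        k2 = rhs(w1 + 0.5 * dt * k1)
        k3 = rhs(w1 + 0.5 * dt * k2)
        k4 = rhs(w1 + dt * k3)
        w1 += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return (1.0 - w1, w1)

def energy_correlation(delta, T, gamma_up, tau):
    """K_E(tau) from the double sum (5.168), minus <E>^2."""
    gamma_down = gamma_up * math.exp(delta / T)      # detailed balance
    gamma_sum = gamma_up + gamma_down
    energies = (0.0, delta)
    w_inf = (gamma_down / gamma_sum, gamma_up / gamma_sum)   # Eq. (5.169)
    avg_product = 0.0
    for m in (0, 1):
        w_tau = conditional_probs(m, gamma_up, gamma_down, tau)
        for m2 in (0, 1):
            avg_product += energies[m] * w_inf[m] * energies[m2] * w_tau[m2]
    mean_energy = sum(energies[m] * w_inf[m] for m in (0, 1))
    return avg_product - mean_energy**2

def energy_correlation_closed_form(delta, T, gamma_up, tau):
    """Closed form: Delta^2 * (G_up * G_down / G_sum^2) * exp(-G_sum * tau)."""
    gamma_down = gamma_up * math.exp(delta / T)
    gamma_sum = gamma_up + gamma_down
    return delta**2 * gamma_up * gamma_down / gamma_sum**2 * math.exp(-gamma_sum * tau)

if __name__ == "__main__":
    args = (1.0, 0.5, 0.3, 0.4)       # delta, T, gamma_up, tau (arbitrary units)
    print(energy_correlation(*args), energy_correlation_closed_form(*args))
```

The two printed numbers should agree to within the integration error, illustrating that the general recipe reproduces the exponential decay with the net rate $\Gamma_\Sigma$.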


Such dependence on frequency 66 is very typical for discrete-state systems described by master
equations. It is interesting that the most widely accepted explanation of the 1/f noise (also called the
“flicker” or “excess” noise), which was mentioned in Sec. 5, is that it is a result of thermally-activated
jumps between metastable states of a statistical ensemble of such two-level systems, with an
exponentially-broad statistical distribution of transition rates $\Gamma_{\uparrow,\downarrow}$. Such a broad distribution follows
from the Kramers formula (156), which is approximately valid for lifetimes of states of systems with
double-well potential profiles (Fig. 13), for a statistical ensemble with a smooth statistical distribution of
energy gaps $\Delta$. Such profiles are typical, in particular, for electrons in disordered (amorphous)
solid-state materials that, indeed, feature high 1/f noise.
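How a broad distribution of rates turns Lorentzian spectra into a 1/f-like spectrum can be illustrated by a small numerical sketch. It rests on an idealization that is an assumption here, not a statement of the text: the net rates $\Gamma$ are taken to be log-uniformly distributed (probability density proportional to $1/\Gamma$) between two cutoffs, and every two-level system contributes a Lorentzian $\Gamma/(\Gamma^2+\omega^2)$ with the same weight. The function name and cutoff values are illustrative.

```python
import math

def summed_lorentzians(omega, gamma_min=1e-4, gamma_max=1e4, n=20_000):
    """Integrate Gamma / (Gamma^2 + omega^2) over the measure dGamma / Gamma (midpoint rule in ln Gamma)."""
    u_lo, u_hi = math.log(gamma_min), math.log(gamma_max)
    h = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n):
        gamma = math.exp(u_lo + (i + 0.5) * h)
        total += gamma / (gamma**2 + omega**2) * h
    return total

if __name__ == "__main__":
    # For gamma_min << omega << gamma_max, omega * S(omega) should stay close to pi/2,
    # i.e. the summed spectrum scales as 1/omega.
    for omega in (0.1, 1.0, 10.0):
        print(omega, omega * summed_lorentzians(omega))
```

Within the frequency range between the cutoffs, the product $\omega S(\omega)$ is nearly constant, so that the summed spectral density is close to the 1/f law; outside that range the spectrum reverts to the Lorentzian behavior of a single system.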



Fig. 5.13. Typical double-well potential profile.


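Returning to the Fokker-Planck equation, we may use the evident generalization of Eq. (168) to
the continuous-variable case:

$$\langle f(t)f(t+\tau)\rangle = \int d^3q\,d^3p\int d^3q'\,d^3p'\; f(\mathbf{q},\mathbf{p})\,w(\mathbf{q},\mathbf{p},\infty)\,f(\mathbf{q}',\mathbf{p}')\,w(\mathbf{q}',\mathbf{p}',\tau), \qquad (5.178)$$

where both probability density distributions are solutions of the equation with the delta-functional initial
condition

$$w(\mathbf{q}',\mathbf{p}',0) = \delta(\mathbf{q}'-\mathbf{q})\,\delta(\mathbf{p}'-\mathbf{p}). \qquad (5.179)$$

For the Smoluchowski equation, valid in the high-damping limit, the expressions are similar, albeit with
a lower dimensionality:

$$\langle f(t)f(t+\tau)\rangle = \int d^3q\int d^3q'\; f(\mathbf{q})\,w(\mathbf{q},\infty)\,f(\mathbf{q}')\,w(\mathbf{q}',\tau), \qquad (5.180)$$

$$w(\mathbf{q}',0) = \delta(\mathbf{q}'-\mathbf{q}). \qquad (5.181)$$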


To see this formalism in action, let us use it to find the correlation function $K_q(\tau)$ of a linear
relaxator, i.e. an overdamped 1D harmonic oscillator with $m\omega_0 \ll \eta$. In this limit, the coordinate
averaged over the heat baths obeys a linear equation,

$$\eta\langle\dot q\rangle + \kappa\langle q\rangle = 0, \qquad (5.182)$$

which describes its exponential relaxation from a certain initial condition $q_0$ to the equilibrium position
$q = 0$, with the reciprocal time constant $\Gamma = \kappa/\eta$:


66 Regardless of the physical sense of such a function of $\omega$, and of whether its maximum is situated at either zero as
in Eq. (177), or at a finite frequency $\omega_0$ as in Eq. (68), it is often referred to as the Lorentzian (or “Breit-Wigner”)
line.
$$\langle q\rangle(t) = q_0 e^{-\Gamma t}. \qquad (5.183)$$

The deterministic equation (182) corresponds to the quadratic potential energy $U(q) = \kappa q^2/2$, so
that the 1D version of the Smoluchowski equation (122) takes the following form:

$$\eta\frac{\partial w}{\partial t} = \kappa\frac{\partial}{\partial q}(wq) + T\frac{\partial^2 w}{\partial q^2}. \qquad (5.184)$$


It is straightforward to check, by substitution, that this equation, rewritten for function $w(q',\tau)$, with the
delta-functional initial condition (181), $w(q',0) = \delta(q'-q)$, is satisfied by a Gaussian function,

$$w(q',\tau) = \frac{1}{(2\pi)^{1/2}\delta q(\tau)}\exp\left\{-\frac{\left(q'-\langle q\rangle(\tau)\right)^2}{2\delta q^2(\tau)}\right\}, \qquad (5.185)$$


with its center, $\langle q\rangle(\tau)$, moving in accordance with Eq. (183), and the time-dependent variance

$$\delta q^2(\tau) = \delta q^2(\infty)\left(1 - e^{-2\Gamma\tau}\right), \quad \text{where } \delta q^2(\infty) = \frac{T}{\kappa}. \qquad (5.186)$$
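The substitution check mentioned above may also be done numerically. The sketch below is only an illustration: it evaluates the residual of Eq. (184) for the Gaussian (185)-(186) by central finite differences at a few points; the parameter values, the step `h`, and the function names are arbitrary assumptions.

```python
import math

ETA, KAPPA, T = 2.0, 3.0, 0.5         # illustrative values, arbitrary units
GAMMA = KAPPA / ETA                   # reciprocal time constant

def w(q_prime, tau, q_start):
    """Gaussian (5.185) with the center (5.183) and the variance (5.186)."""
    var = (T / KAPPA) * (1.0 - math.exp(-2.0 * GAMMA * tau))
    center = q_start * math.exp(-GAMMA * tau)
    return math.exp(-(q_prime - center)**2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def smoluchowski_residual(q_prime, tau, q_start, h=1e-4):
    """eta*dw/dtau - kappa*d(w*q')/dq' - T*d2w/dq'2, by central differences."""
    dw_dtau = (w(q_prime, tau + h, q_start) - w(q_prime, tau - h, q_start)) / (2.0 * h)
    d_wq_dq = ((q_prime + h) * w(q_prime + h, tau, q_start)
               - (q_prime - h) * w(q_prime - h, tau, q_start)) / (2.0 * h)
    d2w_dq2 = (w(q_prime + h, tau, q_start) - 2.0 * w(q_prime, tau, q_start)
               + w(q_prime - h, tau, q_start)) / h**2
    return ETA * dw_dtau - KAPPA * d_wq_dq - T * d2w_dq2

if __name__ == "__main__":
    for q_prime in (-0.3, 0.0, 0.2, 0.6):
        print(q_prime, smoluchowski_residual(q_prime, tau=0.4, q_start=0.5))
```

The printed residuals should be small compared with the individual terms of the equation (which are of the order of unity for these parameters), limited only by the finite-difference error.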


(As a sanity check, the last equality coincides with the equipartition theorem’s result.) Finally, the first
probability under the integral in Eq. (180) may be found from Eq. (185) in the limit $\tau \to \infty$ (in which
$\langle q\rangle(\tau) \to 0$), by replacing $q'$ with $q$:

$$w(q,\infty) = \frac{1}{(2\pi)^{1/2}\delta q(\infty)}\exp\left\{-\frac{q^2}{2\delta q^2(\infty)}\right\}. \qquad (5.187)$$


Now, all components of recipe (180) are ready, and we can write it, for $f(q) = q$, as

$$\langle q(t)q(t+\tau)\rangle = \frac{1}{2\pi\,\delta q(\tau)\,\delta q(\infty)}\int_{-\infty}^{+\infty}dq\int_{-\infty}^{+\infty}dq'\; q\exp\left\{-\frac{q^2}{2\delta q^2(\infty)}\right\}q'\exp\left\{-\frac{\left(q'-qe^{-\Gamma\tau}\right)^2}{2\delta q^2(\tau)}\right\}. \qquad (5.188)$$


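The integral over $q'$ may be worked out first, by replacing that integration variable with $(q'' + qe^{-\Gamma\tau})$
and hence $dq'$ with $dq''$:

$$\langle q(t)q(t+\tau)\rangle = \frac{1}{2\pi\,\delta q(\tau)\,\delta q(\infty)}\int_{-\infty}^{+\infty}q\exp\left\{-\frac{q^2}{2\delta q^2(\infty)}\right\}dq\int_{-\infty}^{+\infty}\left(q''+qe^{-\Gamma\tau}\right)\exp\left\{-\frac{q''^2}{2\delta q^2(\tau)}\right\}dq''. \qquad (5.189)$$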


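The integral of the first term in parentheses $(q'' + qe^{-\Gamma\tau})$ equals zero (as that of an odd function in
symmetric integration limits), while that with the second term is the standard Gaussian integral, giving

$$\langle q(t)q(t+\tau)\rangle = \frac{1}{(2\pi)^{1/2}\delta q(\infty)}e^{-\Gamma\tau}\int_{-\infty}^{+\infty}q^2\exp\left\{-\frac{q^2}{2\delta q^2(\infty)}\right\}dq = \frac{2T}{\pi^{1/2}\kappa}e^{-\Gamma\tau}\int_{-\infty}^{+\infty}\xi^2\exp\left\{-\xi^2\right\}d\xi. \qquad (5.190)$$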


The last integral 67 is just $\pi^{1/2}/2$, so that taking into account that for this stationary system
centered at the coordinate origin, the ensemble average $\langle q\rangle = 0$, 68 we finally get a very simple result,


67 See, e.g., MA Eq. (6.9c). 
$$K_q(\tau) = \langle q(t)q(t+\tau)\rangle = \frac{T}{\kappa}e^{-\Gamma\tau}. \qquad (5.191)$$


As a sanity check, for $\tau = 0$ it yields $K_q(0) = \langle q^2\rangle = T/\kappa$, in accordance with Eq. (186). As $\tau$ is increased
the correlation function decreases monotonically - see the solid-line sketch in Fig. 8.
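The double integral of the recipe (180) for the linear relaxator can also be evaluated by brute force, which gives an independent check of the result $(T/\kappa)e^{-\Gamma\tau}$. The following sketch assumes $f(q) = q$, a finite integration window of several standard deviations, and a plain midpoint rule on a uniform grid; the function name and default parameters are illustrative.

```python
import math

def relaxator_correlation(tau, T=1.0, kappa=1.0, eta=1.0, n=400, span=8.0):
    """Evaluate <q(t) q(t+tau)> from the double integral (5.188); requires tau > 0."""
    gamma = kappa / eta
    var_inf = T / kappa                                      # stationary variance
    var_tau = var_inf * (1.0 - math.exp(-2.0 * gamma * tau)) # Eq. (5.186)
    decay = math.exp(-gamma * tau)
    limit = span * math.sqrt(var_inf)                        # integration window [-limit, +limit]
    h = 2.0 * limit / n
    grid = [-limit + (i + 0.5) * h for i in range(n)]        # midpoints
    norm = 1.0 / (2.0 * math.pi * math.sqrt(var_tau * var_inf))
    total = 0.0
    for q in grid:
        w_inf = math.exp(-q * q / (2.0 * var_inf))
        for q_prime in grid:
            total += q * w_inf * q_prime * math.exp(-(q_prime - q * decay)**2 / (2.0 * var_tau))
    return norm * total * h * h

if __name__ == "__main__":
    tau = 0.5
    print(relaxator_correlation(tau), math.exp(-tau))   # T/kappa = 1 and Gamma = 1 here
```

For the default parameters, $T/\kappa = 1$ and $\Gamma = 1$, so the two printed numbers should practically coincide.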


So, the solution of this very simple problem has required straightforward but somewhat bulky
calculations. On the other hand, the same result may be obtained literally in one line, using the Langevin
formalism - namely, as the Fourier transform (59) of the spectral density (68) in the corresponding limit
$m\omega \ll \eta$, with $S_f(\omega)$ given by Eq. (73): 69

$$K_q(\tau) = 2\int_0^\infty S_q(\omega)\cos\omega\tau\,d\omega = 2\int_0^\infty\frac{\eta T}{\pi}\frac{\cos\omega\tau}{\kappa^2+(\eta\omega)^2}\,d\omega = 2\frac{T}{\pi}\frac{\tau}{\eta}\int_0^\infty\frac{\cos\xi\,d\xi}{(\Gamma\tau)^2+\xi^2} = \frac{T}{\kappa}e^{-\Gamma\tau}. \qquad (5.192)$$

This example illustrates well that for linear systems (and small fluctuations in nonlinear systems)
the Langevin approach is usually much simpler than the one based on the Fokker-Planck or
Smoluchowski equations. However, again, the latter approach is indispensable for the analysis of
fluctuations of arbitrary intensity in nonlinear systems.

To conclude this chapter, I have to emphasize again that the Fokker-Planck and Smoluchowski
equations give a quantitative description of time evolution of nonlinear Brownian systems with finite
dissipation in the classical limit. The description of quantum properties of such dissipative (“open”) and
nonlinear quantum systems is more complex, 70 and only a few simple problems of such theory have
been solved so far, 71 typically using a particular model of the environment, e.g., as a large set of
harmonic oscillators with different statistical distributions of their parameters, leading to different
frequency dependences of the susceptibility $\chi(\omega)$.


5.10. Exercise problems 

5.1. Considering the first 30 digits of number $\pi = 3.1415\ldots$ as a statistical ensemble of integers $k$
(equal to 3, 1, 4, 1, 5, ...), calculate

(i) the average $\langle k\rangle$, and

(ii) the r.m.s. fluctuation $\delta k$.

Compare the results with those for an ensemble of completely random integers 0, 1, ..., 9, and comment.

5.2. For a set of $N$ non-interacting Ising “spins” $s_j = \pm 1$, placed into magnetic field $h$, calculate
the relative fluctuation of the system’s magnetization.

Hint: The total magnetic moment of an Ising system is assumed to be proportional to the sum


68 This fact is not in any contradiction with the nonvanishing result (183) which is only valid for a sub-ensemble 
with a certain (deterministic) initial condition q 0 . 

69 The involved table integral may be found, e.g., in MA Eq. (6.1 1). 

70 See, e.g., QM Sec. 7.6. 

71 See, e.g., the solutions of the ID Kramers problem for quantum systems with low damping by A. Caldeira and 
A. Leggett, Phys. Rev. Lett. 46 , 211 (1981), and with high damping by A. Larkin and Yu. Ovchinnikov, JETP 
Lett. 37,382 (1983). 
$$S = \sum_{j=1}^{N} s_j,$$

so that the requested relative fluctuation may be calculated just as $\delta S/\langle S\rangle$.


5.3. For a field-free, two-site Ising system with energy values $E_m = -Js_1s_2$, in thermal
equilibrium at temperature $T$, find the variance of energy fluctuations. Explore the low-temperature and
high-temperature limits of the result.

5.4. For the 1D, three-site Ising ring with ferromagnetic coupling (and no external field),
calculate the correlation coefficient $\langle s_j s_{j'}\rangle$ for both $j = j'$ and $j \neq j'$.

5.5. Within the framework of Weiss’ molecular-field theory, calculate the variance of spin
fluctuations in the $d$-dimensional Ising model. Use the result to derive the conditions of quantitative
validity of the theory.

5.6. Calculate the variance of fluctuations of the energy of a quantum harmonic oscillator of
frequency $\omega$, in thermal equilibrium at temperature $T$, and express it via the average value of the energy.

5.7. Express the r.m.s. fluctuation of the occupancy $N_k$ of a certain energy level $\varepsilon_k$ by:

(i) a classical particle,

(ii) a fermion, and

(iii) a boson,

in the thermodynamic equilibrium, via the average occupancy $\langle N_k\rangle$, and compare the results.


5.8.* Starting from the Maxwell distribution of velocities, calculate constant $C$ in the
(approximate) expression $K_P(\tau) = C\delta(\tau)$, for the correlation function of fluctuations of pressure $P(t)$ of
an ideal gas of $N$ classical particles. Compare the result with that of Problem 3.2, and estimate the
pressure fluctuation variance.

Hint: You may like to consider a cylindrically-shaped container of
volume $V = LA$ (see Fig. on the right) to calculate fluctuations of force
acting on its plane lid of area $A$, and then recalculate them into fluctuations
of pressure $P$.

5.9. Perhaps the simplest model of diffusion is the 1D discrete
random walk: each time interval $\tau$, a particle leaps, with equal probability, to any of two neighboring
sites of a 1D lattice with the spatial period $a$. Prove that the particle’s displacement during time interval $t \gg \tau$
obeys Eq. (77), and calculate the corresponding diffusion coefficient $D$.


5.10.* Calculate the low-frequency spectral density of current $I(t)$ due to random
passage of charged particles between two conducting electrodes - see Fig. on the right.
Assume that the particles are emitted by one of the electrodes at random times, and are fully
absorbed by the counterpart electrode.

5.11.* Within the rotating-wave approximation (RWA), 72 calculate major statistical properties of
fluctuations of the phase of classical self-oscillations, at:

(i) the free run of the oscillator, and

(ii) its phase locking by an external sinusoidal force,

assuming that the fluctuations are caused by a weak, broadband noise with spectral density $S_f(\omega)$.

5.12. Calculate the correlation function of the coordinate of a 1D harmonic oscillator with small
Ohmic damping at thermal equilibrium.

5.13. Consider a very long, uniform, two-wire transmission line (see Fig.
on the right), that allows the propagation of TEM waves with negligible
attenuation, in thermal equilibrium with the environment at temperature $T$. Find
variance $\langle V^2\rangle_{\Delta\nu}$ of electromagnetic fluctuations of voltage $V$ between the wires
within a small frequency interval $\Delta\nu$.

Hint: As an E&M reminder, 73 TEM waves propagate with a frequency-independent velocity
(equal to $c$ if the wires are in vacuum), with voltage $V$ and current $I$ (see Fig. above) related as
$V(x,t)/I(x,t) = \pm Z$, where $Z$ is a frequency-independent constant (“wave impedance”).

5.14. Now consider a similar line terminated, at one end, with an impedance-matching resistor $R = Z$.
Find variance $\langle V^2\rangle_{\Delta\nu}$ of the voltage across the resistor, and discuss the relation between the result
and the Nyquist theorem (81).

Hint: Take into account that a resistor with $R = Z$ absorbs incident TEM waves without reflection.

5.15. An overdamped classical 1D particle escapes from a
potential well with a smooth bottom, but a sharp edge - see Fig.
on the right. Find the appropriate modification of the Kramers
formula (139).

5.16. A particle may occupy any of $N$ similar sites. The particle’s interaction with the environment
induces its random, incoherent jumps from the occupied site to any other one with the same rate $\Gamma$. Find
the correlation function and the spectral density of fluctuations of the instant occupancy $n(t)$ (equal to
either 1 or 0) of any particular site.




72 See, e.g., CM Sec. 4.3. 

73 See, e.g., EM Sec. 7.6. 
Chapter 6. Elements of Kinetics

This chapter gives a brief introduction to the basic notions of physical kinetics. Its main focus is on the 
Boltzmann equation, especially within the relaxation-time approximation, which allows, in particular, 
an approximate but reasonable and simple description of transport phenomena (such as the electric 
current and thermoelectric effects) in gases, including electron gases in metals and semiconductors. 


6.1. The Liouville theorem 

Physical kinetics is the branch of statistical physics that deals with systems out of
thermodynamic equilibrium. Major tasks of kinetics include:

(i) for autonomous systems (those out of external fields): transient processes ( relaxation ) leading 
from an arbitrary initial state of a system to the thermodynamic equilibrium; 

(ii) for systems in time-dependent external fields (say, in a sinusoidal “ac” field): the periodic 
oscillations of system’s parameters; and 

(iii) for systems in time-independent (“dc”) external fields: dc transport effects. 

In the last case, we are dealing with stationary (d/dt = 0 everywhere), but non-equilibrium 
situations, in which the effect of an external field, continuously driving the system out of the 
equilibrium, is balanced by the simultaneous relaxation - the trend toward the equilibrium. Perhaps the 
most important effect of this class is the dc current in conductors, which alone justifies the inclusion of 
the basic notions of kinetics into any set of core physics courses. 

Actually, the reader who has reached this point of the notes, already has a good taste of physical 
kinetics, because the subject of the last part of Chapter 5 was the kinetics of a “Brownian particle”, i.e. 
of a “heavy” system interacting with environment consisting of many “lighter” components. Indeed, the 
equations discussed in that part - whether the Smoluchowski equation (5.122) or the Fokker-Planck
equation (5.149) - are valid if the environment is in thermodynamic equilibrium, but the system of our 
interest is not necessarily so. As a result, we could use those equations to discuss such non-equilibrium 
phenomena as the Kramers problem for the metastable state lifetime. 

This chapter is devoted to the more traditional subject of kinetics: a system of very many similar 
particles - generally, interacting with each other, but not too strongly, so that the energy of the system 
still may be partitioned into a sum of the components, with the component interactions considered as a 
weak perturbation. Actually, we have already started the job of describing such a system in Sec. 5.8, in 
the course of deriving the Fokker-Planck equation for a single classical particle. Indeed, in the absence 
of particle interactions (i.e. when it is unimportant whether the particle is light or heavy), the probability 
current densities in the coordinate and momentum spaces are given, respectively, by Eqs. (5.142) and 
(5.143), so that the continuity equation (5.140) takes the form 

$$\frac{\partial w}{\partial t} + \nabla_q\cdot(w\dot{\mathbf{q}}) + \nabla_p\cdot(w\dot{\mathbf{p}}) = 0. \qquad (6.1)$$

If similar particles do not interact, this equation for single-particle probability density w(q, p, t) is valid 
for each of them, and the result of its solution may be used to calculate any average of the system as a 
whole. 
Let us rewrite Eq. (1) in the Cartesian component form,

$$\frac{\partial w}{\partial t} + \sum_j\left[\frac{\partial}{\partial q_j}\left(w\dot q_j\right) + \frac{\partial}{\partial p_j}\left(w\dot p_j\right)\right] = 0, \qquad (6.2)$$


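where index $j$ lists all degrees of freedom of the particle, and assume that its motion in an external field
may be described by a Hamiltonian function $\mathcal{H}(q_j, p_j, t)$. Plugging into Eq. (2) the Hamiltonian equations
of motion: 1

$$\dot q_j = \frac{\partial\mathcal{H}}{\partial p_j}, \quad \dot p_j = -\frac{\partial\mathcal{H}}{\partial q_j}, \qquad (6.3)$$

we get

$$\frac{\partial w}{\partial t} + \sum_j\left[\frac{\partial}{\partial q_j}\left(w\frac{\partial\mathcal{H}}{\partial p_j}\right) - \frac{\partial}{\partial p_j}\left(w\frac{\partial\mathcal{H}}{\partial q_j}\right)\right] = 0. \qquad (6.4)$$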


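At the parentheses’ differentiation, the mixed terms $w\,\partial^2\mathcal{H}/\partial q_j\partial p_j$ and $w\,\partial^2\mathcal{H}/\partial p_j\partial q_j$ cancel, and using Eq.
(3) again, we get the so-called Liouville theorem 2

$$\frac{\partial w}{\partial t} + \sum_j\left(\frac{\partial w}{\partial q_j}\dot q_j + \frac{\partial w}{\partial p_j}\dot p_j\right) = 0. \qquad (6.5)$$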


Since the left-hand part of this equation is just the full derivative of the probability density
considered as a function of the generalized coordinates $q_j(t)$ of a particle, its generalized momenta
components $p_j(t)$, and (possibly) time $t$, the Liouville theorem (5) may be presented in a surprisingly
simple form:

$$\frac{dw(q_j(t), p_j(t), t)}{dt} = 0. \qquad (6.6)$$


Physically it means that the probability $dW = w\,d^3q\,d^3p$ to find a Hamiltonian particle in a small volume
of the coordinate-momentum space [q, p], with the center moving in accordance with the deterministic law
(3), does not change with time - see Fig. 1.



Fig. 6.1. Cartoon representation of the 
Liouville theorem in the 6D space [q, p]. 
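The conservation of the phase-space volume may be illustrated numerically for a system with a single degree of freedom. The sketch below is only an illustration under stated assumptions: it uses a pendulum with the (dimensionless) Hamiltonian $p^2/2 - \cos q$, follows three nearby phase-space points with a leapfrog (kick-drift-kick) integrator, and compares the area of the small triangle they form before and after the evolution. The choice of the system, the integrator, and all numerical values are hypothetical.

```python
import math

def leapfrog(q, p, dt, steps):
    """Evolve one phase-space point of the pendulum H = p**2/2 - cos(q) (kick-drift-kick)."""
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)   # half kick: dp/dt = -dH/dq = -sin(q)
        q += dt * p                   # drift:     dq/dt = +dH/dp = p
        p -= 0.5 * dt * math.sin(q)   # half kick
    return q, p

def triangle_area(points):
    """Area of a triangle in the [q, p] plane."""
    (q0, p0), (q1, p1), (q2, p2) = points
    return 0.5 * abs((q1 - q0) * (p2 - p0) - (q2 - q0) * (p1 - p0))

def area_ratio(eps=1e-6, dt=0.01, steps=1000):
    """Ratio of the final and initial areas of a small phase-space triangle."""
    start = [(1.0, 0.0), (1.0 + eps, 0.0), (1.0, eps)]
    end = [leapfrog(q, p, dt, steps) for q, p in start]
    return triangle_area(end) / triangle_area(start)

if __name__ == "__main__":
    print(area_ratio())   # expected to be very close to 1
```

The triangle is sheared and rotated by the flow, but its area stays virtually the same, so that a constant number of ensemble members inside it corresponds to a constant probability density, as stated by Eq. (6).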


1 See, e.g., CM Sec. 10.1. 

2 Actually, this is just one of several theorems bearing the name of J. Liouville (1809-1882). 
At first glance, this may not look surprising, because according to the fundamental Einstein
relation (5.78), one needs non-Hamiltonian forces (such as viscosity) to have diffusion. On the other
hand, it is striking that the Liouville theorem is valid even for (Hamiltonian) systems with deterministic
chaos, 3 in which the deterministic trajectories corresponding to slightly different initial conditions
become increasingly mixed with time.

For an ideal gas of 3D particles, we may select the usual Cartesian coordinates $r_j$ (with $j = 1, 2, 3$)
for the generalized coordinates $q_j$, so that $p_j$ become the Cartesian components $mv_j$ of the usual
(linear) momentum, and the elementary volume is just $d^3r\,d^3p$ - see Fig. 1. In this case Eqs. (3) are just

$$\dot r_j = \frac{p_j}{m} \equiv v_j, \quad \dot p_j = \mathcal{F}_j, \qquad (6.7)$$

so that the Liouville theorem may be rewritten as

$$\frac{\partial w}{\partial t} + \sum_{j=1}^{3}\left(v_j\frac{\partial w}{\partial r_j} + \mathcal{F}_j\frac{\partial w}{\partial p_j}\right) = 0, \qquad (6.8)$$

and conveniently presented in the vector form 4

$$\frac{\partial w}{\partial t} + \mathbf{v}\cdot\nabla w + \boldsymbol{\mathcal{F}}\cdot\nabla_p w = 0, \qquad (6.9)$$


where I have returned to using unindexed symbol V for the vector differentiation in the coordinate space. 


6.2. The Boltzmann equation 

The situation becomes much more complex if particles interact. Generally, a system of $N$ similar
particles in 3D space has to be described by probability density $w$ being a function of $6N + 1$ arguments
($3N$ Cartesian coordinates, plus $3N$ momentum components, plus time). Analytical or numerical
solution of any equation describing time evolution of such a function for a typical ensemble of $N \sim 10^{23}$
particles is evidently a hopeless task. Hence, kinetics of realistic ensembles has to rely on making
reasonable approximations that simplify the situation.

One of the most useful approximations (sometimes called Stosszahlansatz, German for the
“collision number assumption”) was suggested by L. Boltzmann for a gas of particles that move freely
most of the time, but interact during short time intervals, when a particle comes close to either an
immobile scattering center (say, an impurity in a conductor) or to another particle of the gas. Such a
brief scattering event changes the particle’s momentum, and may be approximately described by the
addition of a special term (called the scattering integral) to the right-hand part of Eq. (9):

$$\frac{\partial w}{\partial t} + \mathbf{v}\cdot\nabla w + \boldsymbol{\mathcal{F}}\cdot\nabla_p w = \left.\frac{\partial w}{\partial t}\right|_{\text{scattering}}, \qquad (6.10)$$

while still keeping $w$ a function of only 7 arguments: 3 coordinate components of vector $\mathbf{r}$ and 3
components of momentum $\mathbf{p}$ (all of just one particle), plus time $t$. This is the Boltzmann transport
equation.


3 See, e.g., CM Sec. 9.3. 

4 From this point on, I return to using the index-free symbol 
The concrete form of the scattering integral depends on the scattering object. If scattering centers
do not belong to the ensemble under consideration (again, for example, an impurity atom in a conductor
- see Fig. 2), then the scattering integral may be obtained by an evident generalization of the master
equation (4.100): 5

$$\left.\frac{\partial w}{\partial t}\right|_{\text{scattering}} = \int d^3p'\left[\Gamma_{\mathbf{p}'\to\mathbf{p}}\,w(\mathbf{r},\mathbf{p}',t) - \Gamma_{\mathbf{p}\to\mathbf{p}'}\,w(\mathbf{r},\mathbf{p},t)\right], \qquad (6.11)$$


where the physical sense of $\Gamma_{\mathbf{p}\to\mathbf{p}'}$ is the rate (i.e. the probability per unit time) for the particle to be
scattered from the state with momentum $\mathbf{p}$ into the state with momentum $\mathbf{p}'$.

[Figure: a particle deflected by a scattering center.]

Fig. 6.2. Particle scattering event.


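Most elastic interactions are reciprocal, i.e. obey the following relation (closely related to the
reversibility of time in Hamiltonian systems): $\Gamma_{\mathbf{p}\to\mathbf{p}'} = \Gamma_{\mathbf{p}'\to\mathbf{p}}$, so that Eq. (11) may be rewritten as 6

$$\left.\frac{\partial w}{\partial t}\right|_{\text{scattering}} = \int d^3p'\,\Gamma_{\mathbf{p}\to\mathbf{p}'}\left[w(\mathbf{r},\mathbf{p}',t) - w(\mathbf{r},\mathbf{p},t)\right]. \qquad (6.12)$$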

With such scattering integral, Eq. (10) stays linear in w, but becomes an integro-differential equation, 
typically harder to solve than differential equations. 
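Two basic properties of the reciprocal scattering integral may be seen on a toy discrete analog, in which the momentum space is replaced with a handful of states and the integral with a sum. This is only a sketch under loud assumptions: the number of states, the symmetric rate matrix, and the initial distribution are all hypothetical, and the model ignores the energy conservation that restricts real elastic scattering to states of the same energy.

```python
def scattering_term(w, rate):
    """Discrete analog of the reciprocal scattering integral: dw_i/dt = sum_j rate[i][j] * (w[j] - w[i])."""
    n = len(w)
    return [sum(rate[i][j] * (w[j] - w[i]) for j in range(n)) for i in range(n)]

def relax(w, rate, dt=0.01, steps=5000):
    """Explicit Euler evolution of the distribution under scattering alone."""
    w = list(w)
    for _ in range(steps):
        dw = scattering_term(w, rate)
        w = [wi + dt * dwi for wi, dwi in zip(w, dw)]
    return w

N_STATES = 6
# Hypothetical symmetric (reciprocal) rates: rate[i][j] == rate[j][i]
RATE = [[0.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(N_STATES)]
        for i in range(N_STATES)]
W_START = [0.5, 0.3, 0.1, 0.05, 0.03, 0.02]   # normalized initial probabilities

if __name__ == "__main__":
    print(sum(scattering_term(W_START, RATE)))   # total probability is conserved
    print(relax(W_START, RATE))                  # tends to the uniform distribution
```

The sum of the scattering terms over all states vanishes, so that scattering redistributes the probability without changing its total, and the distribution relaxes to the one for which all bracketed differences vanish.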

The equation becomes even more complex if the scattering is due to mutual interaction of the 
particle members of the system (Fig. 3). 



Fig. 6.3. Particle-particle scattering event.


5 Note that the master equation ignores possible quantum coherence of different scattering events, described by
off-diagonal elements of the density matrix, because $w$ represents only the diagonal elements of the matrix.
However, for ensembles close to thermal equilibrium, this is a reasonable approximation - see Sec. 2.1.

6 One may wonder whether this approximation may work for Fermi particles, for whom the Pauli principle forbids
scattering into the already occupied state, so that for scattering $\mathbf{p} \to \mathbf{p}'$, factor $w(\mathbf{r},\mathbf{p},t)$ in Eq. (12) has to be
multiplied by the probability $[1 - w(\mathbf{r},\mathbf{p}',t)]$ that the final state is available. Generally, this is a valid argument, but
one should notice that if this modification has been done with both terms of Eq. (12), it yields

$$\left.\frac{\partial w}{\partial t}\right|_{\text{scattering}} = \int d^3p'\,\Gamma_{\mathbf{p}\to\mathbf{p}'}\left\{w(\mathbf{r},\mathbf{p}',t)\left[1 - w(\mathbf{r},\mathbf{p},t)\right] - w(\mathbf{r},\mathbf{p},t)\left[1 - w(\mathbf{r},\mathbf{p}',t)\right]\right\}.$$

Opening both square brackets, we see that the probability density products cancel, bringing us back to Eq. (12).
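In this case, the probability of the scattering event scales as a product of two single-particle
probabilities, and the simplest form of the scattering integral is 7

$$\left.\frac{\partial w}{\partial t}\right|_{\text{scattering}} = \int d^3p'\int d^3p_1'\left[\Gamma_{\mathbf{p}'\to\mathbf{p},\,\mathbf{p}_1'\to\mathbf{p}_1}\,w(\mathbf{r},\mathbf{p}',t)\,w(\mathbf{r},\mathbf{p}_1',t) - \Gamma_{\mathbf{p}\to\mathbf{p}',\,\mathbf{p}_1\to\mathbf{p}_1'}\,w(\mathbf{r},\mathbf{p},t)\,w(\mathbf{r},\mathbf{p}_1,t)\right]. \qquad (6.13)$$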


The integration dimensionality in Eq. (13) takes into account the fact that due to the conservation of the
total momentum at scattering,

$$\mathbf{p} + \mathbf{p}_1 = \mathbf{p}' + \mathbf{p}_1', \qquad (6.14)$$

one of the momenta is not an independent argument, so that the integration in Eq. (13) may be restricted
to a 6D p-space rather than the 9D one. For the reciprocal interaction, Eq. (13) may also be a bit
simplified, but it still keeps Eq. (10) a nonlinear integro-differential transport equation, excluding such
powerful solution methods as the Fourier expansion (which hinges on the linear superposition principle).

This is why most useful results based on the Boltzmann transport equation hinge on its further
simplifications, most notably the relaxation-time approximation - RTA for short. 8 This approximation is
based on noticing that in the absence of spatial gradients ($\nabla = 0$) and external forces ($\boldsymbol{\mathcal{F}} = 0$), Eq. (10)
yields

$$\frac{\partial w}{\partial t} = \left.\frac{\partial w}{\partial t}\right|_{\text{scattering}}, \tag{6.15}$$

so that the thermal-equilibrium probability distribution $w_0(\mathbf{r},\mathbf{p},t)$ has to turn any scattering integral into
zero. Hence at small deviations from the equilibrium,

$$\tilde{w}(\mathbf{r},\mathbf{p},t) \equiv w(\mathbf{r},\mathbf{p},t) - w_0(\mathbf{r},\mathbf{p},t) \to 0, \tag{6.16}$$

the scattering integral should be proportional to the deviation $\tilde{w}$, and its simplest reasonable model is

$$\left.\frac{\partial w}{\partial t}\right|_{\text{scattering}} = -\frac{\tilde{w}}{\tau}, \tag{6.17}$$

where $\tau$ is a phenomenological constant (which, according to Eq. (15), has to be positive for the system's
stability) called the relaxation time. Its physical meaning will become clearer in the next section.
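The role of $\tau$ can be illustrated right away. For a uniform system without external forces, and with a time-independent $w_0$, Eqs. (15)-(17) give $d\tilde{w}/dt = -\tilde{w}/\tau$, so a small deviation should decay as $\exp\{-t/\tau\}$. The following is only a minimal numerical sketch of that statement; the function name, the step count and the value $\tau = 1$ are illustrative assumptions, not part of the text.

```python
import math

def relax(w_tilde0, tau, t_end, steps=10_000):
    """Integrate d(w_tilde)/dt = -w_tilde / tau by the explicit Euler method."""
    dt = t_end / steps
    w_tilde = w_tilde0
    for _ in range(steps):
        w_tilde += dt * (-w_tilde / tau)
    return w_tilde

tau = 1.0                                   # relaxation time, arbitrary units (assumed)
numeric = relax(1.0, tau, t_end=3.0 * tau)  # deviation left after three relaxation times
exact = math.exp(-3.0)                      # analytical solution exp(-t / tau)
growing = relax(1.0, -tau, t_end=1.0)       # a negative tau makes the deviation grow
print(numeric, exact, growing)
```

The last call shows why $\tau$ has to be positive: with the opposite sign, any deviation from equilibrium would grow instead of relaxing.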


The relaxation-time approximation is quite reasonable if the angular distribution of the scattering
rate is dominated by small angles between vectors $\mathbf{p}$ and $\mathbf{p}'$ - as it is, for example, for the Rutherford
scattering by a Coulomb center. 9 Indeed, in this case the two functions $w$ participating in Eq. (12) are
close to each other, so that the loss of the second momentum argument ($\mathbf{p}'$) is not too essential.
However, while using the Boltzmann-RTA equation, which results from combining Eqs. (10) and (17),

$$\frac{\partial w}{\partial t} + \mathbf{v}\cdot\nabla w + \boldsymbol{\mathcal{F}}\cdot\nabla_p w = -\frac{\tilde{w}}{\tau}, \tag{6.18}$$


7 This was the approximation used by L. Boltzmann to prove the famous H-theorem, stating that entropy of the
gas described by Eq. (13) may only grow (or stay constant) in time, $dS/dt \geq 0$. Since the model is very
approximate, that result does not seem too fundamental nowadays, despite all its historic significance.

8 Sometimes this approximation is called the “BGK model”, after P. Bhatnagar, E. Gross, and M. Krook who
suggested it in 1954. (The same year, a similar model was considered by P. Welander.)

9 See, e.g., CM Sec. 3.7. 
the reader should always remember that this is just an approximation, sometimes giving completely wrong
results. For example, it prescribes the same time scale, $\tau$, to the relaxation of the net momentum of the
system, and to its energy relaxation, while in many real systems the latter process (that requires inelastic
interactions) may be substantially longer. Naturally, in the following sections I will describe only those
applications of the RTA that give a reasonable description of reality.


6.3. The Ohm law and the Drude formula

Despite its shortcomings, Eq. (18) is adequate for quite a few applications. Perhaps the most
important of them is deriving the Ohm law for dc current in a gas of charged particles, whose only
important deviation from ideality is the scattering in the form of Eq. (17), and hence described, in
equilibrium, by the equilibrium probability $w_0$ of an ideal gas (see Sec. 3.1):

$$w_0(\mathbf{r},\mathbf{p},t) = \frac{g}{(2\pi\hbar)^3}\langle N(\varepsilon)\rangle, \tag{6.19}$$

where $g$ is the degeneracy factor (say, $g = 2$ for electrons due to their spin), and $\langle N(\varepsilon)\rangle$ is the average
occupancy of a quantum state with momentum $\mathbf{p}$, that obeys either the Fermi-Dirac or the Bose-Einstein
distribution:

$$\langle N(\varepsilon)\rangle = \frac{1}{\exp\{(\varepsilon - \mu)/T\} \pm 1}, \qquad \varepsilon = \varepsilon(\mathbf{p}). \tag{6.20}$$

(Up to a point, our calculations will be valid for both statistics, and hence, in the limit $\mu/T \to -\infty$, for a
classical gas as well.)

Now let a uniform, dc electric field $\mathbf{E}$ be applied to the gas, exerting force $\boldsymbol{\mathcal{F}} = q\mathbf{E}$ on each particle
with electric charge $q$. Then the solution to Eq. (18) should also be stationary ($\partial/\partial t = 0$)
and spatially uniform ($\nabla = 0$), so that this equation is reduced to

$$q\mathbf{E}\cdot\nabla_p w = -\frac{\tilde{w}}{\tau}. \tag{6.21}$$

Let us assume the electric field to be relatively low as well, so that the perturbation $\tilde{w}$ it produces is
relatively small. (I will quantify this condition later on.) Then on the left-hand side of Eq. (21) we can
neglect that perturbation, by replacing $w$ with $w_0$, because that side already has a small factor ($\mathbf{E}$). As a
result, this equation yields

$$\tilde{w} = -\tau q\mathbf{E}\cdot\nabla_p w_0 = -\tau q\mathbf{E}\cdot\left(\nabla_p\varepsilon\right)\frac{\partial w_0}{\partial\varepsilon}, \tag{6.22}$$

where the partial derivative sign marks the implied local constancy of parameters $\mu$ and $T$, i.e. their
independence of momentum $\mathbf{p}$. But gradient $\nabla_p\varepsilon$ is nothing else than particle's velocity $\mathbf{v}$ - for a
quantum particle, its group velocity. 10 (This fact is easy to verify for the isotropic and parabolic
dispersion law, pertinent to classical particles moving in free space,

10 See, e.g., QM Sec. 2.1. 
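$$\varepsilon(\mathbf{p}) = \frac{p^2}{2m} = \frac{p_1^2 + p_2^2 + p_3^2}{2m}. \tag{6.23}$$

Indeed, in this case the Cartesian components of vector $\nabla_p\varepsilon$ are

$$\left(\nabla_p\varepsilon\right)_j \equiv \frac{\partial\varepsilon}{\partial p_j} = \frac{p_j}{m} = v_j, \tag{6.24}$$

so that $\nabla_p\varepsilon = \mathbf{v}$.) Hence, Eq. (22) may be rewritten as

$$\tilde{w} = -\tau q\,\mathbf{E}\cdot\mathbf{v}\,\frac{\partial w_0}{\partial\varepsilon}. \tag{6.25}$$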


Let us use this result to calculate the electric current density $\mathbf{j}$. The contribution of each quantum
state to the current density is $q\mathbf{v}w$, so that the total density is

$$\mathbf{j} = \int q\mathbf{v}w\, d^3p = q\int \mathbf{v}\left(w_0 + \tilde{w}\right) d^3p. \tag{6.26}$$

Since in the equilibrium state (with $w = w_0$), the current has to be zero, the integral of the first term in the
parentheses has to vanish. For the integral of the second term, plugging in Eq. (25), and also using Eq.
(19), we get

$$\mathbf{j} = q^2\tau\int \mathbf{v}\left(\mathbf{E}\cdot\mathbf{v}\right)\left(-\frac{\partial w_0}{\partial\varepsilon}\right) d^3p = \frac{gq^2\tau}{(2\pi\hbar)^3}\int \mathbf{v}\left(\mathbf{E}\cdot\mathbf{v}\right)\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right] d^2p_\perp\, dp_\parallel, \tag{6.27}$$

where $d^2p_\perp$ is the elementary area of the constant energy surface in the momentum space, while $dp_\parallel$ is the
momentum differential's component normal to that surface. This result 11 is valid even for particles with
an arbitrary dispersion law $\varepsilon(\mathbf{p})$ (that may be rather complicated, for example, for particles moving in
space-periodic potentials 12 ), and may give, in particular, a fair description of conductivity's anisotropy
in crystals.


For classical particles whose dispersion law is isotropic and parabolic, as in Eq. (23), the
constant energy surface is a sphere of radius $p$, so that $d^2p_\perp = p^2 d\Omega = p^2\sin\theta\, d\theta\, d\varphi$, while $dp_\parallel = dp$. In
spherical coordinates with the polar axis directed along vector $\mathbf{E}$, we get $(\mathbf{E}\cdot\mathbf{v}) = Ev\cos\theta$. Now
separating vector $\mathbf{v}$ outside the parentheses into a component $v\cos\theta$ directed along vector $\mathbf{E}$, and two
perpendicular components, $v\sin\theta\cos\varphi$ and $v\sin\theta\sin\varphi$, we see that the integrals of the last two
components over angle $\varphi$ give zero. Hence, as we could expect, in the isotropic case the net current is
directed along the electric field and obeys the linear Ohm law, 13

$$\mathbf{j} = \sigma\mathbf{E}, \tag{6.28}$$

with a field-independent electric conductivity


11 First obtained by A. Sommerfeld in 1927. 

12 See, e.g., QM Secs. 2.7, 2.8, and 3.4. In this case, p should be understood as the quasi-momentum rather than 
genuine momentum. 

13 As Eq. (27) shows, if the dispersion law $\varepsilon(\mathbf{p})$ is anisotropic, the direction of current density may be different
from that of the electric field. In this case, conductivity should be described by a tensor $\sigma_{jj'}$, rather than a scalar.
$$\sigma = \frac{gq^2\tau}{(2\pi\hbar)^3}\int_0^{2\pi} d\varphi \int_0^{\pi}\sin\theta\, d\theta\,\cos^2\theta \int_0^{\infty} p^2\, dp\, v^2\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]. \tag{6.29}$$

Since $\sin\theta\, d\theta$ is just $-d(\cos\theta)$, the integral over $\theta$ equals (2/3). The integral over $d\varphi$ is of course just $2\pi$,
while that over $p$ may be readily transformed to one over particle's energy $\varepsilon(\mathbf{p}) = p^2/2m$: $v\,dp = p\,dp/m = d\varepsilon$,
so that $p^2 dp\, v^2 = p^2 v\, d\varepsilon = (2m\varepsilon)(2\varepsilon/m)^{1/2} d\varepsilon = (8m\varepsilon^3)^{1/2} d\varepsilon$. As a result, the conductivity equals

$$\sigma = \frac{gq^2\tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\int_0^{\infty}\left(8m\varepsilon^3\right)^{1/2}\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right] d\varepsilon. \tag{6.30}$$

Note that $\sigma$ is proportional to $q^2$ and hence does not depend on the particle charge sign; this is why the
Hall effect in external magnetic field, which lacks this ambivalence, is typically used to determine the
charge of current carriers (electrons or holes) in semiconductors.

So far, the calculations have been valid for any gas (Bose, Fermi, or classical), at an arbitrary
temperature. Let us work out the remaining integral over energy for the most important case of a
degenerate Fermi (say, electron) gas, with $T \ll \varepsilon_F$. 14 As was discussed in Sec. 3.3, in this limit, factor
$(-\partial\langle N(\varepsilon)\rangle/\partial\varepsilon)$ is essentially Dirac's delta-function $\delta(\varepsilon - \varepsilon_F)$, so that the conductivity does not depend on
temperature: 15

$$\sigma = \frac{gq^2\tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\left(8m\varepsilon_F^3\right)^{1/2} = \frac{q^2\tau}{m}\,\frac{g}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\left(2m\varepsilon_F\right)^{3/2} = \frac{q^2\tau}{m}\,\frac{g}{(2\pi\hbar)^3}\,\frac{4\pi p_F^3}{3}. \tag{6.31}$$

But the last fraction in this product is just the volume of the Fermi sphere in the momentum space, so
that the product of the last two fractions is the total number of quantum states filled at $T = 0$ (per unit
volume), i.e. the total density $n \equiv N/V$ of electrons in the gas. Hence, Sommerfeld's result is reduced to
the Drude formula, 16

$$\sigma = \frac{q^2\tau}{m}\, n, \tag{6.32}$$

which should be well familiar to the reader from an undergraduate physics course, with $\tau$ being a scale
of time intervals between scattering events.
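The reduction of Eq. (30) to the Drude formula can be checked numerically. The sketch below (all function names are illustrative, not from the text) evaluates the energy integral of Eq. (30) for the Fermi-Dirac distribution in units where $\mu = 1$, with the constant factor $(8m)^{1/2}$ dropped, and compares it with the zero-temperature value $\mu^{3/2} = 1$ implied by the delta-function argument. It then evaluates Eq. (32) in SI units; the density and relaxation time used there are assumed, order-of-magnitude values for a copper-like metal, not data from this text.

```python
import math

def minus_dN_deps(eps, mu, T):
    """-d<N>/d(eps) for the Fermi-Dirac distribution (T in energy units)."""
    x = (eps - mu) / (2.0 * T)
    if abs(x) > 300.0:          # avoid overflow in cosh; the factor is negligible there
        return 0.0
    return 1.0 / (4.0 * T * math.cosh(x) ** 2)

def simpson(f, a, b, intervals=20_000):
    """Composite Simpson rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def energy_integral(mu, T):
    """Integral of eps**1.5 * (-d<N>/d eps) from 0 to (effectively) infinity."""
    return simpson(lambda e: e ** 1.5 * minus_dN_deps(e, mu, T), 0.0, mu + 40.0 * T)

ratio = energy_integral(mu=1.0, T=0.01)   # expected to be very close to 1 at T << mu

# Drude formula (32) in SI units, with assumed copper-like parameters
q = 1.602e-19      # C, elementary charge
m = 9.109e-31      # kg, free-electron mass
n = 8.5e28         # m^-3, assumed conduction-electron density
tau = 2.5e-14      # s, assumed relaxation time
sigma = n * q ** 2 * tau / m
print(ratio, sigma)
```

With these assumed inputs, `sigma` comes out at about 6 × 10^7 S/m, the right order of magnitude for a good metal at room temperature.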


This calculation poses an important conceptual question. The very structure of Eq. (30)
implies that the only quantum states contributing to electric conductance are those where the derivative
$(-\partial\langle N(\varepsilon)\rangle/\partial\varepsilon)$ is significant. At $T \ll \varepsilon_F$, these are the states in a thin layer at the Fermi sphere's


14 Calculations for a classical gas (which are important, in particular, for most plasmas and non-degenerate
semiconductors) are left for the reader - see the first assignment of Problem 2.

15 At least explicitly, because in some particle collision models, $\tau$ may be a function of temperature, which levels
out only at some temperature much lower than $\varepsilon_F$.

16 It was derived in 1900 by P. Drude. Note that Drude also used the same arguments to derive a very simple
(and very reasonable) approximation for the complex electric conductivity in the ac field of frequency $\omega$:
$\sigma(\omega) = \sigma(0)/(1 - i\omega\tau)$, with $\sigma(0)$ given by Eq. (32); sometimes the name “Drude formula” is used for this expression
rather than for Eq. (32) - see Problem 1.


surface. On the other hand, the classical derivation of Eq. (32) involves all electrons. 17 So, which electrons
exactly are responsible for the conductance: all of them, or only those on the Fermi surface?

For the resolution of this paradox, let us return to Eq. (22) and analyze the physical meaning of
that result. For that, let us compare it with the following model distribution:

$$w_{\text{model}} \equiv w_0(\mathbf{r},\mathbf{p} - \tilde{\mathbf{p}},t), \tag{6.33}$$

where $\tilde{\mathbf{p}}$ is some constant, small vector, which describes a small shift of the unperturbed distribution $w_0$
in the momentum space as a whole. Performing the Taylor expansion of Eq. (33) in this small parameter,
and keeping only two leading terms, we get

$$w_{\text{model}} \approx w_0(\mathbf{r},\mathbf{p},t) + \tilde{w}_{\text{model}}, \qquad \tilde{w}_{\text{model}} = -\tilde{\mathbf{p}}\cdot\nabla_p w_0(\mathbf{r},\mathbf{p},t). \tag{6.34}$$

Comparing the model perturbation with the first form of Eq. (22), we see that they coincide, provided
that

$$\tilde{\mathbf{p}} = q\mathbf{E}\tau = \boldsymbol{\mathcal{F}}\tau. \tag{6.35}$$

This means that Eq. (22) describes a small shift of the equilibrium distribution of electrons by $q\mathbf{E}\tau$ (in
p-space) along the direction of electric field, 18 and gives the picture of the electron transport in a
degenerate gas, shown in Fig. 4.
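To get a feeling for how small the shift (35) typically is, one can compare it with the Fermi momentum. The numbers below are assumptions chosen only for illustration (a field of 1 V/m, and a relaxation time and Fermi velocity of the order usually quoted for good metals); they are not taken from this text.

```python
# Shift of the Fermi sphere, Eq. (35), versus its radius p_F (SI units).
q = 1.602e-19      # C, elementary charge
m = 9.109e-31      # kg, free-electron mass
tau = 2.5e-14      # s, assumed relaxation time
E = 1.0            # V/m, assumed applied field
v_F = 1.6e6        # m/s, assumed Fermi velocity

p_shift = q * E * tau          # magnitude of the shift (35)
p_F = m * v_F                  # Fermi momentum for the parabolic dispersion law (23)
v_drift = p_shift / m          # the corresponding drift velocity
shift_ratio = p_shift / p_F
print(v_drift, shift_ratio)
```

For these inputs the drift velocity is a few millimeters per second, and the sphere is displaced by only a few parts in 10^9 of its radius, which is why the linear expansion (34) works so well at practical fields.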



Fig. 6.4. Filling of momentum states in a 
degenerate electron gas: (a) in the 
absence and (b) in the presence of 
external electric field $\mathbf{E}$. Arrows show
representative scattering events. 


At $\mathbf{E} = 0$, the system is in equilibrium, so that the quantum states inside the Fermi sphere ($p < p_F$)
are occupied, while those outside of it are empty (Fig. 4a). Electron scattering events happen only
between states within a very thin layer ($|p^2/2m - \varepsilon_F| \sim T$) at the Fermi surface, because only in this layer


17 As a reminder, here it is (see also EM Sec. 4.2): Let $\tau$ be the average time at which scattering causes a particle
to lose all the deterministic component of its velocity, $\mathbf{v}_{\text{drift}}$, provided by electric field $\mathbf{E}$, on top of electron's
random thermal motion (which does not contribute to the net current). Using the 2nd Newton law to describe
particle's acceleration by the field, $d\mathbf{v}_{\text{drift}}/dt = q\mathbf{E}/m$, we get $\langle\mathbf{v}_{\text{drift}}\rangle = \tau q\mathbf{E}/m$. Multiplying this result by the particle
charge $q$ and density $n = N/V$, we get the Ohm law $\mathbf{j} = \sigma\mathbf{E}$, with $\sigma$ given by Eq. (32).

18 By the way, since the scale of the fastest change of $w_0$ in the momentum space is of the order of
$\partial w_0/\partial p = (\partial w_0/\partial\varepsilon)(d\varepsilon/dp) \sim (1/T)v_F$, the linear approximation (34) is valid if $eE\tau \ll T/v_F$, i.e. if $eEl \ll T$, where $l \equiv v_F\tau$ is
called the mean free path. This is the promised quantitative condition of the electric field smallness; since the
left-hand side of the last inequality is just the average energy given to the particle by the electric field between two
scattering events, the condition may be interpreted as the smallness of electron gas' “overheating” by the applied
field. However, another condition is also necessary - see the last paragraph of this section.
the states are partially occupied, so that both components of the product $w(\mathbf{r},\mathbf{p},t)[1 - w(\mathbf{r},\mathbf{p}',t)]$,
mentioned in Sec. 1, do not vanish. These scattering events, on the average, do not change the
equilibrium probability distribution, because they are uniformly spread over the Fermi surface.

At the instant the electric field is turned on, it starts to accelerate all electrons in its
direction, i.e. the whole Fermi sphere starts moving in the momentum space, along the field's direction
in the real space. For elastic scattering events (with $|\mathbf{p}'| = |\mathbf{p}|$), this creates an addition of occupied states
at the leading front of the accelerating sphere, and an addition of free states on its trailing edge (Fig. 4b).
As a result, now there are more scattering events bringing electrons from the leading edge to the trailing
edge of the sphere than in the opposite direction. This creates the average backflow of the state occupancy
in the momentum space. These two trends eventually cancel each other, and the Fermi sphere
approaches a stationary (though not equilibrium!) state, with the shift (35) relative to its
thermal-equilibrium position.

Thus Fig. 4b presents a clear answer to the question which of the two different interpretations of
the Drude formula is correct, and due to electrons' indistinguishability, the answer is: either. On one
hand, we can look at the electric current as a result of shift (35) of all electrons in the momentum space.
On the other hand, each filled quantum state deep inside the sphere gives exactly the same contribution
into the net current density as it did without the field. All these internal contributions to the net current
cancel each other, so that the applied field changes the situation only at the Fermi surface. Thus it is
equally legitimate to say that only the surface states are responsible for the nonvanishing net current. 19

Let me also mention the second paradox related to the Drude formula, which is often
misunderstood (not only by students :-). As was emphasized above, $\tau$ is finite even at elastic scattering -
that by itself does not change the total energy of the electron gas. The question is how such
scattering may be responsible for Ohmic resistivity $\rho \equiv 1/\sigma$, and hence for the Joule heat production,
with power density $\mathcal{P}/V = \mathbf{j}\cdot\mathbf{E} = \rho j^2$? The answer is that the Drude and Sommerfeld formulas describe
just the “bottleneck” of the Joule heat formation. In the scattering picture (Fig. 4b) the elastically
scattered electron states are predominantly located above the (shifted) Fermi surface, and eventually
need to relax onto it via some inelastic process that releases their additional energy in the form of heat
(in solid state materials, described by phonons - see Sec. 2.6). The rate and other features of these
inelastic phenomena do not participate in the Drude formula directly, but for keeping the theory valid (in
particular, keeping the probability distribution $w$ close to its equilibrium value $w_0$), their intensity has to
be sufficient to avoid gas overheating by the applied field. This gives an additional restriction on the
simple theory described above. In some semiconductors, the charge carrier overheating effects, resulting
in deviations from the Ohm law, i.e. from the linear relation (28) between $\mathbf{j}$ and $\mathbf{E}$, may be readily
observed already at rather modest applied electric fields.


6.4. Electrochemical potential and the drift-diffusion equation

Now let us generalize our calculation to the case when transport takes place in the presence of a
time-independent spatial gradient of the probability distribution, $\nabla w \neq 0$, caused for example by that of
the particle concentration $n = N/V$ (and hence, according to Eq. (3.40), of the chemical potential $\mu$),


19 So here, as it frequently happens in physics, formulas (or drawings, such as Fig. 4b) give a clearer and more
unambiguous description of the reality than words - the privilege lacked by many other scientific (and
“scientific”) disciplines, frequently leading to unending, shallow verbal debates.
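while still considering temperature $T$ constant. For this generalization, we should just keep the second
term on the left-hand side of Eq. (18). If the gradient of $w$ is sufficiently small, we can repeat arguments
of the last section and replace $w$ with $w_0$ in this term as well. With the applied electric field $\mathbf{E}$ presented
as $(-\nabla\phi)$, where $\phi$ is the electrostatic potential, Eq. (25) now becomes

$$\tilde{w} = \tau\,\mathbf{v}\cdot\left(\frac{\partial w_0}{\partial\varepsilon}\, q\nabla\phi - \nabla w_0\right). \tag{6.36}$$

Since in any of distributions (20), $\langle N(\varepsilon)\rangle$ is a function of $\varepsilon$ and $\mu$ only in combination $(\varepsilon - \mu)$, it obeys the
following relation,

$$\frac{\partial\langle N(\varepsilon)\rangle}{\partial\mu} = -\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}. \tag{6.37}$$

Using this relation, the gradient of $w_0 \propto \langle N(\varepsilon)\rangle$ may be presented as 20

$$\nabla w_0 = -\frac{\partial w_0}{\partial\varepsilon}\nabla\mu, \qquad \text{for } T = \text{const}, \tag{6.38}$$

so that Eq. (36) becomes

$$\tilde{w} = \tau\frac{\partial w_0}{\partial\varepsilon}\,\mathbf{v}\cdot\left(q\nabla\phi + \nabla\mu\right) = \tau\frac{\partial w_0}{\partial\varepsilon}\, q\,\mathbf{v}\cdot\nabla\Phi, \tag{6.39}$$

where the following sum,

$$\Phi \equiv \phi + \frac{\mu}{q}, \tag{6.40}$$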


is called the electrochemical potential. 21 Now repeating the calculation of the electric current, carried
out in the last section, we get the following generalization of the Ohm law (28):

$$\mathbf{j} = \sigma\left(-\nabla\Phi\right) \equiv \sigma\boldsymbol{\mathcal{E}}, \tag{6.41}$$

where the effective electric field $\boldsymbol{\mathcal{E}}$ is the (minus) gradient of the electrochemical potential, rather than of the
electrostatic potential:

$$\boldsymbol{\mathcal{E}} \equiv -\nabla\Phi = \mathbf{E} - \frac{\nabla\mu}{q}, \qquad \text{where } \mathbf{E} \equiv -\nabla\phi. \tag{6.42}$$

The physics of this extremely important result 22 may be explained in two ways. First, let us have
a look at the energy spectrum of a uniform, degenerate Fermi gas confined in a volume of finite size. In
order to ensure such a confinement, we need a piecewise-constant potential $U(\mathbf{r})$ - a “hard-wall, flat-


20 Since we consider $w_0$ as a function of two independent arguments $\mathbf{r}$ and $\mathbf{p}$, taking its gradient, i.e.
differentiation of this function over $\mathbf{r}$, does not involve its differentiation over the kinetic energy $\varepsilon$ - which is a
function of $\mathbf{p}$ only.

21 In electronic engineering literature, variable $q\Phi \equiv \mu + q\phi$, called the local Fermi level, is more frequently used.

22 Relation (42) does not include the phenomenological parameter $\tau$ of the relaxation-time approximation, so that
it is more general than the RTA. Indeed, Eq. (42) is based on the relation between the second and third terms on
the left-hand side of the rather general Eq. (10).
bottom potential well” - see Fig. 5a. (In a solid conductor, such a profile is readily provided by the
positively charged ions of the crystal lattice.) The well should be of a sufficient depth $U_0 > \varepsilon_F \equiv \mu|_{T=0}$
in order to provide the confinement of the overwhelming majority of the particles, with energies
below or slightly above the Fermi level $\varepsilon_F$. This means that there should be a substantial energy gap,

$$\psi \equiv U_0 - \mu \gg T, \tag{6.43}$$

between the Fermi energy of a particle inside the well, and its potential energy outside the well. (The
latter value is usually called the vacuum level.) The difference defined by Eq. (43) is called the
workfunction; 23 for most metals, it is between 4 and 5 eV, so that relation $\psi \gg T$ is well fulfilled for
room temperatures ($T \sim 0.025$ eV) - and actually for all temperatures up to the material's evaporation
point.




Fig. 6.5. Potential profiles of (a) a single conductor and (b,c) a system 
of two closely located conductors, for two different biasing conditions: 
(b) zero electrostatic field (“flat-band”), and (c) zero voltage $V = \Delta\Phi$.


Now let us consider two conductors, with different values of $\psi$, separated by a small gap $d$ - see
Fig. 5b, c. Panel (b) shows the case when the electric field $\mathbf{E} = -\nabla\phi$ in the free-space gap between the
conductors equals zero, i.e. their electrostatic potentials $\phi$ are equal. 24 If there is an opportunity for
particles to cross the gap (e.g., by either the thermally-activated hopping over the potential barrier,
discussed in Secs. 5.6-5.7, or quantum-mechanical tunneling through it), there will be an average flux of
particles from the conductor with the higher Fermi level to that with the lower Fermi level, 25 because the
chemical equilibrium requires their equality - see Secs. 1.5 and 2.7. If the particles have an electric
charge (as electrons do), the equilibrium will be automatically achieved by them recharging the effective
capacitor formed by the conductors, until the electrostatic energy difference $q\Delta\phi$ reaches the value
reproducing that of the workfunctions (Fig. 5c). According to Eq. (43), at the recharging, sum $(\psi + \mu)$
of each conductor has to stay constant, so that for the equilibrium potential difference 26 we may write

$$q\Delta\phi = \Delta\psi = -\Delta\mu. \tag{6.44}$$


23 Sometimes also called the “electron affinity”, though the latter term is mostly used for atoms and molecules. 

24 In semiconductor physics and engineering, the situation shown in Fig. 5b is called the flat-band condition,
because in semiconductors, any electric field at the surface leads to band bending - a gradual spatial change of the
background potential $U_0$ and hence of all energy band/gap edges. For a discussion of the band bending and its
effects on semiconductor device operation, see, e.g., either Chapter 6 in J. Hook and H. Hall, Solid State Physics,
2nd ed., Wiley, 1991, or Chapter 3 in S. Sze, Semiconductor Devices, 2nd ed., Wiley, 2001.

25 As measured from a common reference value, for example from the vacuum level. 

26 In physics literature, it is usually called the contact potential difference, while in electrochemistry (for which it 
is one of the key notions), the term Volta potential is more common. 
At this equilibrium, the electric field in the gap between the conductors is

$$\mathbf{E} = \frac{\Delta\phi}{d}\,\mathbf{n} = -\frac{\Delta\mu}{qd}\,\mathbf{n} = \frac{\nabla\mu}{q}; \tag{6.45}$$

in Fig. 5c the field is clearly visible as the tilt of the electric potential profile. Comparing Eq. (45) with
definition (42) of the effective electric field $\boldsymbol{\mathcal{E}}$, we see that the transport equilibrium, i.e. the absence of
current, is achieved exactly when $\boldsymbol{\mathcal{E}} = 0$, in accordance with Eq. (41).


Another interpretation of Eq. (41) may be achieved by modifying Eq. (38) for the particular case
of a classical gas. Indeed, the gas' local density $n \equiv N/V$ obeys Eq. (3.32), which may be presented as

$$n(\mathbf{r}) = \text{const}\times\exp\left\{\frac{\mu(\mathbf{r})}{T}\right\}. \tag{6.46}$$

Taking the spatial gradient of both parts of this relation (at constant $T$), we get

$$\nabla n = \text{const}\times\frac{1}{T}\exp\left\{\frac{\mu}{T}\right\}\nabla\mu = \frac{n}{T}\nabla\mu, \tag{6.47}$$

so that $\nabla\mu = (T/n)\nabla n$, and Eq. (41), with $\sigma$ given by Eq. (32), may be recast as

$$\mathbf{j} = \sigma\left(-\nabla\Phi\right) = \frac{q^2\tau}{m}\, n\left(-\nabla\phi - \frac{1}{q}\nabla\mu\right) = \frac{q\tau}{m}\left(nq\mathbf{E} - T\nabla n\right). \tag{6.48}$$

Hence the current may be viewed as consisting of two independent parts: one due to the “usual” electric
field $\mathbf{E} = -\nabla\phi$, and another due to the particle diffusion - see Eq. (5.118) and its discussion. This is
exactly the physics of the “mysterious” term $\nabla\mu$ in Eq. (42), though it may be presented in the simple
form (48) only in the classical limit.

Besides being very useful for practice, 27 Eq. (48) gives us a pleasant surprise. Namely, plugging
it into the continuity equation for electric charge,

$$\frac{\partial(qn)}{\partial t} + \nabla\cdot\mathbf{j} = 0, \tag{6.49}$$

we get (after the division of all terms by $q\tau/m$) the so-called drift-diffusion equation: 28

$$\frac{m}{\tau}\frac{\partial n}{\partial t} = \nabla\cdot\left(n\nabla U\right) + T\nabla^2 n, \qquad \text{with } U \equiv q\phi. \tag{6.50}$$
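One way to see what Eq. (50) does is to integrate it numerically in one dimension and watch the density relax. The sketch below is only an illustration under stated assumptions: a closed 1D box, a linear potential energy $U(x)$, arbitrary units with $\tau/m = 1$ and $T = 1$, and a simple explicit finite-volume scheme chosen here for brevity (none of this is prescribed by the text). For a classical gas the stationary state should be the Boltzmann profile $n \propto \exp\{-U/T\}$, i.e. Eq. (46) with a uniform sum $\mu + U$.

```python
import math

def relax_to_equilibrium(U, T, mobility, dx, dt, steps):
    """Explicit finite-volume integration of
         dn/dt = mobility * d/dx ( n dU/dx + T dn/dx )
    on a 1D grid with zero-flux (closed) ends; `mobility` stands for tau/m."""
    n = [1.0] * len(U)                      # start from a uniform density
    for _ in range(steps):
        flux = []                           # particle flux through each interior cell face
        for i in range(len(n) - 1):
            n_face = 0.5 * (n[i] + n[i + 1])
            dU = (U[i + 1] - U[i]) / dx
            dn = (n[i + 1] - n[i]) / dx
            flux.append(-mobility * (n_face * dU + T * dn))
        flux = [0.0] + flux + [0.0]         # closed ends: no flow through the walls
        n = [n[i] - dt * (flux[i + 1] - flux[i]) / dx for i in range(len(n))]
    return n

cells, dx = 20, 0.05
x = [(i + 0.5) * dx for i in range(cells)]
U = [2.0 * xi for xi in x]                  # assumed linear potential energy U = q*phi
T, mobility = 1.0, 1.0                      # arbitrary units (assumed)
n = relax_to_equilibrium(U, T, mobility, dx, dt=0.001, steps=3000)

boltzmann = [math.exp(-Ui / T) for Ui in U]
scale = sum(n) / sum(boltzmann)
max_dev = max(abs(ni / (scale * bi) - 1.0) for ni, bi in zip(n, boltzmann))
print(sum(n), max_dev)
```

The flux form of the update conserves the total number of particles, and the remaining small deviation from the Boltzmann profile is the discretization error of the scheme rather than a physical effect.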


Comparing it with Eq. (5.122), we see that the drift-diffusion equation is identical to the Smoluchowski
equation, 29 if we identify ratio $\tau/m$ with mobility $\mu_m \equiv 1/\eta$:

$$\frac{\tau}{m} \leftrightarrow \mu_m \equiv \frac{1}{\eta}, \tag{6.51}$$


27 In particular, in physics of semiconductor devices, where electrons in the conduction band, and holes in the 
valence band, may be frequently treated as nearly-ideal classical gases. 

28 Sometimes this term is associated with Eq. (52). 

29 And hence, at negligible $\nabla U$, identical to the diffusion equation (5.116).
and hence the following combination, $\tau T/m$, with the diffusion constant $D$ - see (5.78). As a result, Eq.
(48) is frequently rewritten as an expression for the particle flow density $\mathbf{j}_n \equiv n\langle\mathbf{v}\rangle = \mathbf{j}/q$:

$$\mathbf{j}_n = n\mu_m q\mathbf{E} - D\nabla n. \tag{6.52}$$
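As a quick consistency check of Eq. (52) (a sketch under the classical-gas assumption of Eq. (46), with the electrostatic field written as $\mathbf{E} = -\nabla\phi$): in thermal equilibrium the electrochemical potential is uniform, so that $\nabla\mu = -q\nabla\phi = q\mathbf{E}$, and Eq. (47) gives $\nabla n = (n/T)q\mathbf{E}$. Substituting this into Eq. (52), with $D = \tau T/m = \mu_m T$, we get

```latex
\mathbf{j}_n = n\mu_m q\mathbf{E} - D\nabla n
             = n\mu_m q\mathbf{E} - \mu_m T\,\frac{n}{T}\,q\mathbf{E} = 0 ,
```

i.e. the drift and diffusion flows cancel exactly, as they must in equilibrium. This cancellation is what fixes the ratio $D/\mu_m = T$ between the two coefficients.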

This similarity may look surprising. Indeed, our (or rather Einstein's :-) treatment of the
Brownian motion in Chapter 5 was based on a strong hierarchy of the total system, consisting of a large
“Brownian particle” in an environment of many smaller particles - “molecules”. On the other hand, in
this chapter we are considering a gas of similar particles. Nevertheless, the equations describing the
dynamics of their probability distribution are the same - at least within the framework of the Boltzmann
transport equation with the relaxation-time approximation (17) of the scattering integral.

The origin of this similarity is that Eq. (12) is applicable to Brownian particles as well, with each
“scattering” event being the particle's hit by a random molecule. Since, due to the mass hierarchy, the
particle momentum change at each such event is small, the scattering integral has to be local, i.e. depend
only on $w$ at the same momentum $\mathbf{p}$ as the left-hand side of the Boltzmann equation, so that the
relaxation-time approximation (17) is absolutely natural. But the same is true for collisions of similar
particles, if they are dominated by small-angle scattering, as is true, for example, for Coulomb scattering. 30

Returning to the electric field duality ($\mathbf{E} \leftrightarrow \boldsymbol{\mathcal{E}}$) revealed by our analysis, it raises a natural
question: which of these fields are we speaking about in the everyday and laboratory practice? Upon
some contemplation, the reader should agree that most of our electric field measurements are done
indirectly, by measuring corresponding voltages - with voltmeters. A vast majority of these instruments
belong to the electrodynamic variety that is based on the measurement of a small current flowing
through the voltmeter. As Eq. (41) shows, electrodynamic voltmeters measure the electrochemical
potential difference $\Delta\Phi$. However, there exists a rare breed of electrostatic voltmeters (also called
“electrometers”) that measure the electrostatic potential difference $\Delta\phi$ between two conductors. One
way to implement such an instrument is to use a usual, electrodynamic voltmeter, but with the reference
point set at the flat-band condition (Fig. 5b) between the conductors. This condition may be detected by
vanishing electric charge on the adjacent surfaces of the conductors, and hence by the absence of its
modulation in time, caused by a specially arranged periodic variation of the distance between the
surfaces. Another (less sensitive but also less invasive) way to detect the flat-band condition is to
measure the voltage at which the force of electrostatic interaction between two conductors, which is
proportional to $E^2 \propto (\Delta\phi)^2$, vanishes.


6.5. Thermoelectric effects 

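Now let us extend our analysis even further, to the effects of a finite (though small) temperature
gradient. Again, since for any of statistics (20), the average occupancy $\langle N(\varepsilon)\rangle$ is a function of just one
combination of all its arguments, $\xi \equiv (\varepsilon - \mu)/T$, its partial derivatives obey not only Eq. (37), but also the
following relation:

$$\frac{\partial\langle N(\varepsilon)\rangle}{\partial T} = -\frac{\varepsilon - \mu}{T^2}\frac{\partial\langle N(\varepsilon)\rangle}{\partial\xi} = \frac{\varepsilon - \mu}{T}\frac{\partial\langle N(\varepsilon)\rangle}{\partial\mu}. \tag{6.53}$$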


30 See, e.g., CM Sec. 3.7. 
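As a result, Eq. (38) is generalized as

$$\nabla w_0 = -\frac{\partial w_0}{\partial\varepsilon}\left(\nabla\mu + \frac{\varepsilon - \mu}{T}\nabla T\right), \tag{6.54}$$

giving the following generalization of Eq. (39):

$$\tilde{w} = \tau\frac{\partial w_0}{\partial\varepsilon}\,\mathbf{v}\cdot\left(q\nabla\Phi + \frac{\varepsilon - \mu}{T}\nabla T\right). \tag{6.55}$$

Now, repeating the current density calculation, we get a result which is traditionally presented as

$$\mathbf{j} = \sigma\left(-\nabla\Phi\right) + \sigma\mathcal{S}\left(-\nabla T\right), \tag{6.56}$$

where constant $\mathcal{S}$, called the Seebeck coefficient 31 (or the “thermoelectric power”, or just
“thermopower”) is defined by the following relation:

$$\sigma\mathcal{S} = \frac{gq\tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\int_0^{\infty}\left(8m\varepsilon^3\right)^{1/2}\frac{\varepsilon - \mu}{T}\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right] d\varepsilon. \tag{6.57}$$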

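Working out this integral for the most important case of a degenerate Fermi gas, with $T \ll \varepsilon_F$,
we have to be careful, because the center of the sharp peak of the last factor under the integral coincides
with the zero point of the previous factor, $(\varepsilon - \mu)/T$. This uncertainty may be resolved using the
Sommerfeld expansion formula (3.59). Indeed, for a smooth function $f(\varepsilon)$ defined by Eq. (3.60), so that
$f(0) = 0$, we may use (3.61) to rewrite the formula as

$$\int_0^{\infty} f(\varepsilon)\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right] d\varepsilon \approx f(\mu) + \frac{\pi^2}{6}T^2\left.\frac{d^2f(\varepsilon)}{d\varepsilon^2}\right|_{\varepsilon=\mu}. \tag{6.58}$$

In particular, for integral (57), we may take $f(\varepsilon) \equiv (8m\varepsilon^3)^{1/2}(\varepsilon - \mu)/T$. (Evidently, for this function,
condition $f(0) = 0$ is satisfied.) Then $f(\mu) = 0$, $d^2f/d\varepsilon^2|_{\varepsilon=\mu} = 3(8m\mu)^{1/2}/T \approx 3(8m\varepsilon_F)^{1/2}/T$, and Eq. (57) yields

$$\sigma\mathcal{S} = \frac{gq\tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\,\frac{\pi^2T^2}{6}\,\frac{3\left(8m\varepsilon_F\right)^{1/2}}{T}. \tag{6.59}$$

Comparing the result with Eq. (31), for constant $\mathcal{S}$ we get a simple expression independent of $\tau$: 32

$$\mathcal{S} = \frac{\pi^2}{2q}\frac{T}{\varepsilon_F} = \frac{c_V}{q}, \tag{6.60}$$

where $c_V \equiv C_V/N$ is the heat capacity of the gas per particle, given by Eq. (3.70).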
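Equation (60) can be tested against a direct numerical evaluation of the integrals in Eqs. (30) and (57). In the sketch below (function names are illustrative; units with $\mu = 1$ are assumed, and the common prefactors are dropped because they cancel in the ratio), the product $q\mathcal{S}$ is computed as the ratio of the two energy integrals and compared with $(\pi^2/2)(T/\mu)$. The last lines convert the result to SI units for an assumed Fermi energy of 7 eV at 300 K; since the text measures $T$ in energy units, the conversion multiplies Eq. (60) by $k_B$, and the sign is taken negative for electrons ($q = -e$).

```python
import math

def minus_dN_deps(eps, mu, T):
    """-d<N>/d(eps) for the Fermi-Dirac distribution (T in energy units)."""
    x = (eps - mu) / (2.0 * T)
    if abs(x) > 300.0:          # avoid overflow in cosh; the factor is negligible there
        return 0.0
    return 1.0 / (4.0 * T * math.cosh(x) ** 2)

def simpson(f, a, b, intervals=20_000):
    """Composite Simpson rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

mu, T = 1.0, 0.02               # degenerate gas: T << mu (assumed values)
upper = mu + 40.0 * T           # the integrands are negligible beyond this energy
numerator = simpson(lambda e: e ** 1.5 * (e - mu) / T * minus_dN_deps(e, mu, T), 0.0, upper)
denominator = simpson(lambda e: e ** 1.5 * minus_dN_deps(e, mu, T), 0.0, upper)
qS_numeric = numerator / denominator        # q * S from Eqs. (30) and (57)
qS_formula = math.pi ** 2 / 2.0 * T / mu    # q * S from Eq. (60)

# SI estimate for electrons, with an assumed Fermi energy
k_B_over_e = 8.617e-5           # V/K, Boltzmann constant over elementary charge
T_kelvin, eps_F_eV = 300.0, 7.0
S_volts_per_K = -(math.pi ** 2 / 2.0) * k_B_over_e * (k_B_over_e * T_kelvin / eps_F_eV)
print(qS_numeric, qS_formula, S_volts_per_K)
```

The two values of $q\mathcal{S}$ agree to better than a percent, and the SI estimate is of the order of a microvolt per kelvin, much smaller than $k_B/e \approx 86$ μV/K because of the small factor $T/\varepsilon_F$.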


31 Named after T. Seebeck who experimentally discovered, in 1821 (independently of J. Peltier), the effect 
expressed by Eq. (62). 

32 Again, such independence implies that Eq. (60) should have a broader validity than in our simple model of an
isotropic gas. This is indeed the case: at $T \ll \varepsilon_F$, this result turns out to be valid for any form of the Fermi surface,
and for any dispersion law $\varepsilon(\mathbf{p})$.
In order to understand the physical meaning of the Seebeck coefficient, it is sufficient to consider
a conductor carrying no current. For this case, Eq. (56) yields

$$\nabla\left(\Phi + \mathcal{S}T\right) = 0. \tag{6.61}$$

Thus, the temperature gradient creates the oppositely directed gradient of the effective electric
potential $\Phi$, i.e. the effective field $\boldsymbol{\mathcal{E}}$ defined by Eq. (42). This is the Seebeck effect. Figure 6 shows the
standard way of its measurement, using a usual (electrodynamic) voltmeter that measures the difference
of potentials $\Phi$, and a connection (in this context, called thermocouple) of two different materials, with
different coefficients $\mathcal{S}$. Integrating Eq. (61) around the loop from point A to point B, and neglecting the
temperature drop across the voltmeter, we get the following simple expression for the thermally-induced
difference of the electrochemical potential, frequently also called either the thermoelectric power or
the “thermo e.m.f.”:

$$\begin{aligned} V \equiv \Phi_B - \Phi_A &= \int_A^B \nabla\Phi\cdot d\mathbf{r} = -\int_A^B \mathcal{S}\,\nabla T\cdot d\mathbf{r} = -\mathcal{S}_1\int_{A'}^{A''}\nabla T\cdot d\mathbf{r} - \mathcal{S}_2\left(\int_A^{A'}\nabla T\cdot d\mathbf{r} + \int_{A''}^{B}\nabla T\cdot d\mathbf{r}\right) \\ &= -\mathcal{S}_1\left(T'' - T'\right) - \mathcal{S}_2\left(T' - T''\right) = \left(\mathcal{S}_1 - \mathcal{S}_2\right)\left(T' - T''\right). \end{aligned} \tag{6.62}$$

(Note that according to Eq. (62), any attempt to measure such voltage across any two points of a uniform
conductor would give results depending on the voltmeter lead materials, due to the unintentional
gradient of temperature in them.)
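In practice Eq. (62) is used in both directions: to predict the voltage for known junction temperatures, and to recover an unknown temperature from a measured voltage. The following minimal sketch assumes temperature-independent coefficients and a responsivity of 70 μV/°C (the typical chromel-constantan figure); the function names and the junction temperatures are illustrative.

```python
def thermocouple_voltage(responsivity, t_prime, t_double_prime):
    """Eq. (62): V = (S1 - S2) * (T' - T''), for a constant responsivity S1 - S2."""
    return responsivity * (t_prime - t_double_prime)

def junction_temperature(voltage, responsivity, t_reference):
    """Invert Eq. (62) for T', given the reference-junction temperature T''."""
    return t_reference + voltage / responsivity

responsivity = 70e-6                                       # V/°C, assumed constant
voltage = thermocouple_voltage(responsivity, 120.0, 20.0)  # junctions at 120 °C and 20 °C
recovered = junction_temperature(voltage, responsivity, 20.0)
print(voltage, recovered)
```

A 100 °C difference thus gives only about 7 mV, so resolving 1 °C requires measuring voltages at the level of tens of microvolts.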





Using thermocouples is a popular, inexpensive method of temperature measurement - especially
in the few-hundred-°C range (where gas- and fluid-based thermometers are not too practicable), if a
1°C-scale accuracy is sufficient. The “responsivity” $(\mathcal{S}_1 - \mathcal{S}_2)$ of a typical popular thermocouple,
chromel-constantan, 33 is about 70 μV/°C. In order to understand why typical values of $\mathcal{S}$ are so small, let
us discuss Seebeck effect's physics. Superficially, it is very simple: particles, heated by an external
source, diffuse from it toward the colder parts of the conductor, carrying electrical current with them if
they are charged. However, this naive argument neglects the fact that at $\mathbf{j} = 0$, there should be no total
flow of particles. For a more accurate interpretation, note that the Seebeck effect is described by the
factor $(\varepsilon - \mu)/T$ in integral (57), which changes sign at the Fermi surface, i.e. at the same energy where
the term $(-\partial\langle N(\varepsilon)\rangle/\partial\varepsilon)$, describing the state availability for transport (due to their intermediate occupancy


33 Both these materials are alloys, i.e. solid solutions: chromel is 10% chromium in 90% nickel, while constantan 
is 45% nickel and 55% copper. 
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0 < (N(s)) < 1), reaches its peak. The only reason why that integral does not vanish completely, and 
hence S' ^ 0, is the growth of first factor under the integral (which describes the number of available 
quantum states) with s, so the hotter particles (with s> fi) are more numerous and carry more heat then 
the colder ones. 
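
The near-cancellation described above may be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes the density-of-states factor proportional to ε^(3/2) (the same factor that appears in the function f(ε) used in this section), an energy-independent relaxation time, and energies measured in units of μ. It returns the dimensionless product qS as the ratio of the integral with the factor (ε − μ)/T to the one without it. The names `trapezoid` and `seebeck_times_q` are hypothetical.

```python
import math

import numpy as np


def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def seebeck_times_q(t_over_mu, half_width=40.0, points=20001):
    """Dimensionless q*S of a degenerate Fermi gas (T in energy units).

    Assumptions: isotropic gas, states factor ~ eps**1.5, constant tau.
    """
    mu = 1.0
    temp = t_over_mu * mu
    lo = max(mu - half_width * temp, 0.0)
    eps = np.linspace(lo, mu + half_width * temp, points)
    # -d<N>/d(eps) for the Fermi distribution: a peak of width ~T at eps = mu
    window = 1.0 / (4.0 * temp * np.cosh((eps - mu) / (2.0 * temp)) ** 2)
    states = eps ** 1.5
    numerator = trapezoid(states * (eps - mu) / temp * window, eps)
    denominator = trapezoid(states * window, eps)
    return numerator / denominator


if __name__ == "__main__":
    for ratio in (0.005, 0.01, 0.02):
        numeric = seebeck_times_q(ratio)
        leading = math.pi ** 2 / 2 * ratio  # leading Sommerfeld-expansion term
        print(f"T/mu = {ratio}: qS = {numeric:.5f}, (pi^2/2) T/mu = {leading:.5f}")
```

The result stays close to (π²/2)(T/μ) ≪ 1, i.e. the two halves of the integrand nearly cancel. To express S in SI units, the dimensionless qS should be multiplied by k_B/|q|, which is about 86 μV/K for the elementary charge.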

The Seebeck effect is of course not the only result of a temperature gradient; the same diffusion of 
hotter particles also causes a flow of heat from the regions of higher T to those with lower T, i.e. the 
effect of thermal conductivity, well known from our everyday practice. The heat (i.e. thermal energy) 
flow density may be calculated similarly to that of the electric current - see Eq. (26), with the natural 
replacement of the electric charge q of each particle with the thermal energy (ε − μ) of its state: 34 

$$\mathbf{j}_h = \int (\varepsilon - \mu)\,\mathbf{v}\,w\,d^3p. \tag{6.63}$$

Again, at equilibrium (w = w₀) the heat flow vanishes, so that w may be replaced with its perturbation 
w̃, which has already been calculated - see Eq. (55). The substitution of that expression into Eq. (63), 
and its transformation exactly similar to the one performed above for the electric current j, yields 35 

$$\mathbf{j}_h = \sigma\Pi(-\nabla\Phi) + \kappa(-\nabla T), \tag{6.64}$$


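with the coefficients Π and κ defined by the equalities 

$$\sigma\Pi \equiv \frac{g q \tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\int_0^\infty (8m\varepsilon^3)^{1/2}\,(\varepsilon - \mu)\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon, \tag{6.65}$$

$$\kappa \equiv \frac{g \tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\int_0^\infty (8m\varepsilon^3)^{1/2}\,\frac{(\varepsilon - \mu)^2}{T}\left[-\frac{\partial\langle N(\varepsilon)\rangle}{\partial\varepsilon}\right]d\varepsilon. \tag{6.66}$$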


Besides the missing factor T in the denominator, the integral in Eq. (65) is the same as in Eq. (57), so that 
the constant Π (called the Peltier coefficient) is simply and fundamentally related to the Seebeck 
coefficient: 36 

$$\Pi = ST. \tag{6.67}$$


34 One more way to look at Eq. (63) is as at the difference between the total energy flow density, j_ε = ∫εvw d³p, and 
the product of a constant (μ) by the particle flow density, j_n = ∫vw d³p = j/q. 

35 The expression given by the second term of this relation, j_h = −κ∇T, is much more general than our analysis: 
for small temperature gradients it is valid in virtually any medium - for example, in insulators, where the first 
term of Eq. (64) vanishes. (In the general case, the thermal conductivity κ is of course different from that given by 
Eq. (66).) As a result, this relation has its own name - the Fourier law, because it was first suggested by the 
same universal genius J.-B. J. Fourier - who not only developed such a key mathematical tool as the Fourier 
series, but also discovered what is now called the greenhouse effect! 

36 The simplicity of this relation (first discovered experimentally in 1854 by W. Thomson, a.k.a. Lord Kelvin) is 
not accidental. This is one of the fundamental Onsager reciprocal relations between kinetic coefficients (L. Onsager, 
1931), which are model-independent, i.e. valid within very general assumptions. Unfortunately, I have no time 
left for a discussion of this interesting topic, and have to refer the interested reader, for example, to Sec. 120 in L. 
Landau and E. Lifshitz, Statistical Physics, 3rd ed., Pergamon, 1980. Note, however, that the range of validity of 
the Onsager relations is still debated - see, e.g., K.-T. Chen and P. Lee, Phys. Rev. B 79, 18 (2009). 

On the other hand, integral (66) may be readily calculated, for the most important case of a 
degenerate Fermi gas, using the Sommerfeld expansion (58) with f(ε) = (8mε³)^{1/2}(ε − μ)²/T, for which 
f(μ) = 0 and d²f/dε²|_{ε=μ} = 2(8mμ³)^{1/2}/T ≈ 2(8mε_F³)^{1/2}/T, so that 

$$\kappa = \frac{g\tau}{(2\pi\hbar)^3}\,\frac{4\pi}{3}\,\frac{\pi^2}{6}\,T^2\,\frac{2(8m\varepsilon_F^3)^{1/2}}{T}. \tag{6.68}$$


Comparing the result with the first form of Eq. (31), we get the so-called Wiedemann-Franz law 37 

$$\frac{\kappa}{\sigma} = \frac{\pi^2}{3}\,\frac{T}{q^2}. \quad \text{(Wiedemann-Franz law)} \tag{6.69}$$


This relation between the electric conductivity σ and the thermal conductivity κ is more general 
than our formal derivation might imply. Indeed, it is straightforward to show that the Wiedemann-Franz 
law is also valid for an arbitrary anisotropy of the dispersion law (i.e. an arbitrary Fermi surface shape) and, 
moreover, well beyond the relaxation-time approximation. (For example, it is also valid for scattering 
integral (12) with an arbitrary angular dependence of rate Γ, provided that the scattering is elastic.) 
Experiments show that the law is well obeyed by most metals, but only at relatively low temperatures 
T ≪ T_D, when the thermal conductance due to electrons is well above the one due to lattice vibrations, i.e. 
phonons - see Sec. 2.6. (Note also that Eq. (69) is not valid for classical gases - see Problem 2.) 
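
As a numerical cross-check of the Wiedemann-Franz ratio, the sketch below evaluates the two energy integrals directly instead of using the Sommerfeld expansion. It is an illustration only, under the same assumptions as the derivation in this section (isotropic degenerate gas, energy-independent τ, states factor proportional to ε^(3/2)); all function names are hypothetical. It also converts the dimensionless ratio π²/3 to the Lorenz number in SI units, using the exact SI values of k_B and the elementary charge.

```python
import math

import numpy as np

K_B = 1.380649e-23          # Boltzmann constant, J/K (exact SI value)
E_CHARGE = 1.602176634e-19  # elementary charge, C (exact SI value)


def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def wiedemann_franz_ratio(t_over_mu, half_width=40.0, points=20001):
    """Return kappa * q**2 / (sigma * T), with T in energy units."""
    mu = 1.0
    temp = t_over_mu * mu
    lo = max(mu - half_width * temp, 0.0)
    eps = np.linspace(lo, mu + half_width * temp, points)
    # -d<N>/d(eps) for the Fermi distribution
    window = 1.0 / (4.0 * temp * np.cosh((eps - mu) / (2.0 * temp)) ** 2)
    states = eps ** 1.5
    # kappa ~ integral with ((eps - mu)^2 / T); sigma / q^2 ~ integral without it
    kappa_like = trapezoid(states * (eps - mu) ** 2 / temp * window, eps)
    sigma_like = trapezoid(states * window, eps)
    return kappa_like / (sigma_like * temp)


def lorenz_number():
    """Lorenz number L = (pi^2 / 3) * (k_B / e)^2, in W*Ohm/K^2."""
    return math.pi ** 2 / 3 * (K_B / E_CHARGE) ** 2


if __name__ == "__main__":
    print(f"numeric ratio at T/mu = 0.01: {wiedemann_franz_ratio(0.01):.4f}")
    print(f"pi^2 / 3                    : {math.pi ** 2 / 3:.4f}")
    print(f"Lorenz number               : {lorenz_number():.3e} W*Ohm/K^2")
```

The small residual difference between the numeric ratio and π²/3 is of the order of (T/μ)², i.e. it comes from the higher terms of the Sommerfeld expansion.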

Now let us discuss the less evident, first term of Eq. (64). It describes the so-called Peltier effect, 
which may be measured in the geometry similar to that shown in Fig. 6, but driven by an external 
voltage source - see Fig. 7. 




37 It was named after G. Wiedemann and R. Franz who noticed the constancy of the ratio κ/σ for various materials, at 
the same temperature, as early as in 1853. The direct proportionality of the ratio to the absolute temperature was 
noticed by L. Lorenz in 1872. Due to this contribution, the Wiedemann-Franz law is frequently presented as κ/σ = 
LT, where the constant L, called the Lorenz number, in SI units is close to 2.45×10⁻⁸ W·Ω/K². Theoretically, Eq. (69) 
was derived in 1928 by A. Sommerfeld. 

The voltage drives a certain dc current I = jA (where A is the conductor’s cross-section area), 
necessarily the same in the whole loop. However, according to Eq. (64), if materials 1 and 2 are 
different, the power H = j_h A of the heat flow is different in the two parts of the loop. Indeed, if the whole 
system is kept at the same temperature (∇T = 0), the integration of that equation over the cross-section yields 

$$\left(\frac{H}{I}\right)_{1,2} = \left(\frac{j_h}{j}\right)_{1,2} = \left(\frac{\sigma\Pi}{\sigma}\right)_{1,2} = \Pi_{1,2}. \tag{6.70}$$


This means that in order to sustain the constant temperature, the following power difference, 

$$\Delta H = (\Pi_1 - \Pi_2)\,I, \quad \text{(Peltier effect)} \tag{6.71}$$

has to be extracted from one junction of the two materials, and inserted into the other junction. If a 
constant temperature is not maintained, the former junction is heated, while the latter one is cooled (on 
top of the bulk Joule heating), thus implementing a thermoelectric heat pump / refrigerator. Such 
refrigerators, with no moving parts and no gas or fluid materials, are very convenient for modest (by a few 
tens of °C) cooling of relatively small components of various systems - from sensitive radiation detectors 
in spacecraft, all the way to cold drinks in vending machines. It is straightforward to use the above formulas 
to show that the efficiency of active materials used in such thermoelectric refrigerators may be 
characterized by the following dimensionless figure-of-merit, 

$$ZT \equiv \frac{\sigma S^2 T}{\kappa}. \tag{6.72}$$


For the best thermoelectric materials found so far, ZT is in the range from 2 to 3, providing the 
coefficient of performance, defined by Eq. (1.69), of the order of 0.5 - a few times lower than that of 
traditional, mechanical refrigerators. The search for composite materials (including those with 
nanoparticles) with higher values of ZT is one of the very active fields of applied solid state physics. 38 
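
To get a feeling for the numbers, the following sketch evaluates the Peltier power ΔH = (Π₁ − Π₂)I with Π = ST, and the standard figure-of-merit ZT = σS²T/κ, in SI units (T in kelvins, S in V/K). The function names are hypothetical, and the material parameters in the usage part are round illustrative values assumed for this example, not data from the text.

```python
def peltier_coefficient(seebeck, temperature):
    """Peltier coefficient Pi = S * T (in volts, for S in V/K and T in K)."""
    return seebeck * temperature


def peltier_power(s1, s2, temperature, current):
    """Power (Pi_1 - Pi_2) * I, in watts, transferred between the junctions."""
    pi_1 = peltier_coefficient(s1, temperature)
    pi_2 = peltier_coefficient(s2, temperature)
    return (pi_1 - pi_2) * current


def figure_of_merit(sigma, seebeck, kappa, temperature):
    """Dimensionless ZT = sigma * S**2 * T / kappa.

    sigma in S/m, seebeck in V/K, kappa in W/(m*K), temperature in K.
    """
    return sigma * seebeck ** 2 * temperature / kappa


if __name__ == "__main__":
    # Thermocouple-like pair with S1 - S2 = 70 uV/K at 300 K, carrying 1 A:
    print(f"Peltier power: {peltier_power(70e-6, 0.0, 300.0, 1.0) * 1e3:.1f} mW")
    # Hypothetical thermoelectric material (illustrative round numbers only):
    zt = figure_of_merit(sigma=1e5, seebeck=200e-6, kappa=1.5, temperature=300.0)
    print(f"ZT = {zt:.2f}")
```

The first number shows why ordinary metallic pairs are poor coolers: at a current of 1 A, the Peltier power is only about 20 mW, easily masked by the Joule heating.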

Let me finish this chapter (and this course, and this series :-) by emphasizing again that due to 
time/space restrictions I was able to barely scratch the surface of physical kinetics. 39 


6.6. Exercise problems 

6.1. Use the relaxation-time approximation of the Boltzmann equation to prove the Drude 
formula for the complex conductivity at frequency ω, 

$$\sigma(\omega) = \frac{\sigma(0)}{1 - i\omega\tau},$$

where σ(0) is the dc conductivity given by Eq. (6.30) of the lecture notes, and give a physical 
interpretation of the formula. 


38 See, e.g., D. Rowe (ed.), Thermoelectrics Handbook: Macro to Nano, CRC Press, 2005. 

39 A much more detailed coverage of this important part of physics may be found, for example, in the textbook by 
L. Pitaevskii and E. Lifshitz, Physical Kinetics, Butterworth-Heinemann, 1981. A detailed discussion of its 
applications to mechanical engineering may be found, e.g., in T. Bergman et al., Fundamentals of Heat and Mass 
Transfer, 7th ed., Wiley, 2011. 

6.2. Use the variable separation method to calculate the time evolution of the particle density 
distribution in space, provided that at t = 0, the particles are released from their uniform distribution in a 
wide box of thickness 2a: 

$$n = \begin{cases} n_0, & \text{for } -a < x < +a, \\ 0, & \text{otherwise.} \end{cases}$$


6.3. For the 1D version of the diffusion equation (i.e. of the drift-diffusion equation (6.50) with 
τT/m = D, and without the drift-inducing field, ∇U = 0), 

$$\frac{\partial n}{\partial t} = D\,\frac{\partial^2 n}{\partial x^2}, \quad \text{for } -\infty < x < +\infty,$$

find an appropriate spatial-temporal Green’s function, and use it to solve the previous problem. 


6.4.* Calculate the electric conductance of a narrow, uniform conducting channel between two 
bulk conductors, in the low-voltage and low-temperature limit, neglecting the electron interaction and 
scattering inside the channel. 


6.5. Calculate the electric conductivity σ, the thermal conductivity κ, as well as the 
thermoelectric coefficients S and Π, for a classical, ideal gas of electrically charged particles. Compare 
the results with those for the degenerate Fermi gas, derived in the lecture notes. 

6.6. Derive a partial differential equation describing the time evolution of the temperature 
distribution in a medium with negligible thermal expansion and with temperature-independent specific 
heat c_V and thermal conductivity κ, given by the Fourier law 

$$\mathbf{j}_h = -\kappa\nabla T.$$

6.7. Use the equation derived in the previous problem to calculate the time evolution of 
temperature in the center of a uniform solid sphere of radius R, initially heated to temperature T_h and at t 
= 0 placed into a heat bath that keeps its surface at temperature T₀. 

6.8. Suggest a reasonable definition of the entropy production rate (per unit volume), and 
calculate this rate for a stationary thermal conduction, assuming that it obeys the Fourier law, in a 
material with negligible thermal expansion. Give a physical interpretation of the result. Does the 
stationary temperature distribution in a sample correspond to the minimum of the total entropy 
production in it? 
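
For readers who want to cross-check their analytical answers to Problems 6.2 and 6.3 numerically, the sketch below integrates the 1D diffusion equation with a simple explicit finite-difference scheme, starting from the box-shaped initial distribution. It is not a solution of the problems, just a testing aid under stated assumptions: the grid, the time step, the default parameter values, and the function names are all hypothetical choices, and the computational domain is assumed to be much wider than the spreading distribution. Two exactly known properties are used as sanity checks: the total number of particles is conserved, and the mean-square width of the distribution grows as 2Dt.

```python
import numpy as np


def diffuse_box(n0=1.0, a=1.0, diff=1.0, half_length=10.0, points=401,
                t_final=1.0, r=0.4):
    """Explicit (forward-time, centered-space) solution of dn/dt = D d2n/dx2.

    r = D * dt / dx**2 must not exceed 0.5 for the scheme to be stable.
    Returns the grid, the initial and final densities, and the elapsed time.
    """
    x = np.linspace(-half_length, half_length, points)
    dx = x[1] - x[0]
    dt = r * dx ** 2 / diff
    steps = int(round(t_final / dt))
    n_start = np.where(np.abs(x) < a, n0, 0.0)  # box of thickness 2a
    n = n_start.copy()
    for _ in range(steps):
        # Interior points only; the far-away end points are kept at zero.
        n[1:-1] += r * (n[2:] - 2.0 * n[1:-1] + n[:-2])
    return x, n_start, n, steps * dt


def moments(x, n):
    """Total particle number (per unit area) and variance of the distribution."""
    dx = x[1] - x[0]
    total = float(np.sum(n) * dx)
    mean = float(np.sum(x * n) / np.sum(n))
    variance = float(np.sum((x - mean) ** 2 * n) / np.sum(n))
    return total, variance


if __name__ == "__main__":
    x, n_start, n_end, t = diffuse_box()
    total_0, var_0 = moments(x, n_start)
    total_1, var_1 = moments(x, n_end)
    print(f"particle number: {total_0:.6f} -> {total_1:.6f}")
    print(f"variance growth: {var_1 - var_0:.6f}, expected 2Dt = {2.0 * t:.6f}")
```

An analytical result for n(x, t) may be compared with the array n_end point by point; the agreement should improve as the grid is refined.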
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